Regression in Meta-Analysis

Michael Borenstein
Larry V. Hedges
Julian P.T. Higgins
Hannah Rothstein

Draft – Please do not quote
This draft released January 6, 2015
Please send comments to [email protected]
An updated copy of this manual will be posted at http://www.meta-analysis.com/pages/cma_manual.php

Contents

Part 1: Data files and downloads .......... 13
Part 2: Overview of Meta-regression .......... 14
Part 3: The BCG example .......... 15
Part 4: Meta-regression is observational .......... 19
Part 5: Fixed-effect vs. Random-effects .......... 21
    Putting regression in context .......... 29
    Fixed-effect model .......... 30
        Basic analysis (Case A) .......... 30
            The traditional approach .......... 30
                Test of effect size .......... 31
                Test of the statistical model .......... 31
            The regression approach .......... 31
                Test of effect size .......... 32
                Analysis of variance .......... 32
            Summary .......... 33
        Subgroups analysis (Case B) .......... 33
            The traditional approach .......... 33
                Is the common effect size zero for each subgroup? .......... 34
                Analysis of variance .......... 35
            The regression approach .......... 36
                Analysis of variance .......... 37
            Summary .......... 39
        Continuous covariate (Case C) .......... 39
            The traditional approach .......... 39
            The regression approach .......... 39
                Analysis of variance .......... 40
            Summary .......... 41
        In context .......... 41
    Random-effects model .......... 42
        Basic analysis (Case A) .......... 42
            The traditional approach .......... 42
                Test of effect size .......... 43
                Heterogeneity .......... 43
            The regression approach .......... 44
                Test of effect size .......... 45
                Test of the model .......... 45
                Goodness of fit .......... 46
                Comparison of Model 1 with the null model .......... 47
            Summary .......... 48
        Subgroups analysis (Case B) .......... 48
            The traditional approach .......... 49
                Note on computing T² .......... 50
                Is the mean effect size zero for each subgroup? .......... 51
                Test of the model .......... 51
                Heterogeneity .......... 51
            The regression approach .......... 52
                Test of the model .......... 54
                Goodness of fit .......... 54
                Comparison of Model 1 with the null model .......... 56
            Summary .......... 58
        Continuous covariate (Case C) .......... 58
            The traditional approach .......... 58
            The regression approach .......... 59
                Prediction equation .......... 60
                Test of the model .......... 60
                Goodness of fit .......... 60
                Comparison of Model 1 with the null model .......... 62
            Summary .......... 63
        In context .......... 64
Part 6: Meta-regression in CMA .......... 65
    What’s new in this version of meta-regression? .......... 66
    The covariates and the predictive model .......... 67
    Quick Start .......... 68
    Step 1: Enter the data .......... 69
        Insert column for study names .......... 69
        Insert columns for effect size data .......... 70
        Insert columns for moderators (covariates) .......... 74
        Customize the screen .......... 79
        Enter the data .......... 82
    Step 2: Run the basic meta-analysis .......... 83
        The main analysis screen .......... 84
        The initial meta-analysis .......... 84
        Display moderator variables .......... 86
        Display statistics .......... 88
    Step 3: Run the meta-regression .......... 89
        The Interactive Wizard .......... 90
        Add covariates to the model .......... 91
        Set computational options .......... 93
        Run the regression .......... 94
    Step 4: Navigate the results .......... 95
        Main results screen (Fixed effect) .......... 95
        Main results screen (Random effects) .......... 96
        Difference between the fixed-effect and random-effects displays .......... 97
        Plot .......... 98
        Other screens .......... 99
    Step 5: Save the analysis .......... 100
    Step 6: Export the results .......... 102
Part 7: Understanding the results .......... 104
    Main results .......... 105
    Main results, fixed-effect analysis .......... 106
        Test of the model .......... 108
        Analysis of variance .......... 108
        Summary .......... 112
    Main results, random-effects analysis .......... 113
        Test of the model [D] .......... 117
        Goodness of fit [E] .......... 117
        Comparison of Model 1 with the null model .......... 122
        Summary .......... 123
    Diagnostics .......... 125
    Covariance .......... 132
    Correlations .......... 133
    Increments .......... 134
Part 8: The R² index .......... 145
    The schematic for R² .......... 149
    A seeming anomaly .......... 150
    Assessing change in the model .......... 152
    Understanding I² .......... 156
Part 9: Working with the plot .......... 159
    Confidence interval and prediction interval .......... 166
Part 10: Computational options .......... 177
    Knapp-Hartung vs. Z .......... 178
    One-point or simultaneous confidence intervals for graph .......... 187
    Options for estimating τ² (MM, ML, REML) .......... 189
    One-sided vs. two-sided tests .......... 191
Part 11: Categorical covariates .......... 192
    Dummy variables .......... 193
Part 12: When does it make sense to omit the intercept .......... 205
    The example .......... 205
    Interpreting the results .......... 207
Part 13: Working with “Sets” of covariates .......... 211
    Defining a “Set” .......... 211
    How to create a Set .......... 214
    How to remove a Set .......... 216
Part 14: Interactions and curvilinear relationships .......... 217
    Interaction of two categorical covariates .......... 220
    Interaction of a categorical covariate with a continuous covariate .......... 227
    Interaction of two continuous covariates .......... 234
    Curvilinear relationships .......... 240
Part 15: Missing data .......... 244
Part 16: Filter studies .......... 254
Part 17: Defining several models .......... 262
Part 18: Complex data structures .......... 273
    Independent subgroups within studies .......... 273
        Using subgroup as the unit of analysis .......... 274
        Using study as the unit of analysis .......... 277
    Multiple outcomes or time-points .......... 282
    Multiple comparisons .......... 292
Part 19: Some caveats .......... 293
    Statistical power for meta-regression .......... 294
    Multiple comparisons .......... 295
Part 20: Technical Appendix .......... 297
    Appendix 1: The dataset .......... 298
    Appendix 2: Understanding Q .......... 302
    Appendix 3: Tests of heterogeneity .......... 318
    Appendix 4: Computing τ² in the presence of subgroups .......... 327
    Appendix 5: Creating variables for interactions .......... 331
    Appendix 6: Plotting a curvilinear relationship .......... 334
    Appendix 7: Plotting interactions .......... 338
        Plotting the interaction of two categorical covariates .......... 339
        Plotting the interaction of a categorical covariate by a continuous covariate .......... 343
        Plotting the interaction of two continuous covariates .......... 347
    Appendix 8: Interpreting regression coefficients .......... 352
    Appendix 9: Meta-regression in Stata .......... 353
References .......... 358

Figures

Figure 1 | Basic analysis | Random effects | Risk ratio .......... 15
Figure 2 | Data-entry screen .......... 16
Figure 3 | Basic analysis | Random effects | Log risk ratio .......... 18
Figure 4 | Regression of log risk ratio on latitude | Fixed-effect .......... 27
Figure 5 | Regression of log risk ratio on latitude | Random-effects .......... 27
Figure 6 | Basic analysis | Fixed-effect | Log risk ratio .......... 30
Figure 7 | Basic analysis | Fixed-effect | Log risk ratio .......... 31
Figure 8 | Regression setup | Intercept only .......... 31
Figure 9 | Regression | Main results | Fixed-effect | Intercept only .......... 32
Figure 10 | Subgroups Cold vs. Hot | Fixed-effect .......... 34
Figure 11 | Subgroups Cold vs. Hot | Fixed-effect .......... 35
Figure 12 | Regression Cold vs. Hot | Setup .......... 36
Figure 13 | Regression Cold vs. Hot | Fixed-effect .......... 37
Figure 14 | Regression | Latitude | Setup .......... 39
Figure 15 | Regression | Latitude | Fixed-effect .......... 40
Figure 16 | Basic analysis | Log risk ratio | Random-effects .......... 42
Figure 17 | Basic analysis | Log risk ratio | Random-effects .......... 43
Figure 18 | Regression | Intercept | Setup .......... 44
Figure 19 | Regression | Intercept | Main results | Random-effects .......... 45
Figure 20 | Dispersion of effects about grand mean .......... 47
Figure 21 | Subgroups Cold vs. Hot | Random-effects .......... 49
Figure 22 | Subgroups Cold vs. Hot | Random-effects .......... 50
Figure 23 | Option for computing T² in the presence of subgroups .......... 50
Figure 24 | Option for computing T² in the presence of subgroups .......... 51
Figure 25 | Regression | Climate | Setup .......... 53
Figure 26 | Regression | Climate | Main results | Random-effects .......... 53
Figure 27 | Dispersion of effects about the subgroup means .......... 56
Figure 28 | Dispersion about grand mean vs. dispersion about subgroup means .......... 57
Figure 29 | Regression | Latitude | Setup .......... 59
Figure 30 | Regression | Latitude | Main results | Random-effects .......... 59
Figure 31 | Dispersion of effects about regression line for latitude .......... 61
Figure 32 | Dispersion about grand mean vs. dispersion about regression line .......... 63
Figure 33 | Data-entry | Step 01 .......... 69
Figure 34 | Data-entry | Step 02 .......... 69
Figure 35 | Data-entry | Step 03 .......... 70
Figure 36 | Data-entry | Step 04 .......... 70
Figure 37 | Data-entry | Step 05 .......... 71
Figure 38 | Data-entry | Step 06 .......... 72
Figure 39 | Data-entry | Step 07 .......... 73
Figure 40 | Data-entry | Step 08 .......... 73
Figure 41 | Data-entry | Step 09 .......... 74
Figure 42 | Data-entry | Step 10 .......... 75
Figure 43 | Data-entry | Step 11 .......... 76
Figure 44 | Data-entry | Step 12 .......... 77
Figure 45 | Data-entry | Step 13 .......... 78
Figure 46 | Data-entry | Step 14 .......... 79
Figure 47 | Data-entry | Step 15 .......... 80
Figure 48 | Data-entry | Step 16 .......... 81
Figure 49 | Data-entry | Step 17 .......... 81
Figure 50 | Data-entry | Step 18 .......... 82
Figure 51 | Data-entry | Step 19 .......... 83
Figure 52 | Basic analysis | Fixed-effect | Log risk ratio .......... 84
Figure 53 | Basic analysis | Random-effects | Log risk ratio .......... 85
Figure 54 | Basic analysis | Display moderators .......... 86
Figure 55 | Basic analysis | Display moderators .......... 86
Figure 56 | Basic analysis | Display moderators .......... 87
Figure 57 | Basic analysis | Display statistics for heterogeneity .......... 88
Figure 58 | Run regression | Step 01 .......... 89
Figure 59 | Run regression | Step 02 .......... 90
Figure 60 | Run regression | Step 03 .......... 91
Figure 61 | Run regression | Step 04 .......... 92
Figure 62 | Run regression | Step 05 .......... 93
Figure 63 | Run regression | Step 06 .......... 94
Figure 64 | Main results | Fixed-effect .......... 95
Figure 65 | Main results | Random-effects .......... 96
Figure 66 | Plot .......... 98
Figure 67 | Other screens .......... 99
Figure 68 | Save analysis .......... 100
Figure 69 | Export results .......... 102
Figure 70 | Export results .......... 103
Figure 71 | Setup .......... 106
Figure 72 | Main results | Fixed-effect .......... 107
Figure 73 | Plot | Fixed-effect .......... 109
Figure 74 | Plot | Year | Fixed-effect .......... 110
Figure 75 | Plot | Latitude | Fixed-effect .......... 111
Figure 76 | Run regression | Setup .......... 113
Figure 77 | Main results | Random-effects .......... 114
Figure 78 | Main results | Random-effects .......... 115
Figure 79 | Dispersion of effects about regression line for latitude .......... 118
Figure 80 | Plot | Allocation method | Random-effects .......... 119
Figure 81 | Plot | Year | Random-effects .......... 120
Figure 82 | Plot | Latitude | Random-effects .......... 121
Figure 83 | Dispersion of effects about two regression lines .......... 123
Figure 84 | Setup .......... 125
Figure 85 | Diagnostics .......... 125
Figure 86 | Covariance matrix .......... 132
Figure 87 | Correlation matrix .......... 133
Figure 88 | Main results | Random-effects .......... 134
Figure 89 | Setup | Intercept only .......... 135
Figure 90 | Main results | Intercept only .......... 136
Figure 91 | Setup | Intercept + Allocation .......... 137
Figure 92 | Main results | Intercept + Allocation .......... 137
Figure 93 | Setup | Intercept + Allocation + Year .......... 139
Figure 94 | Main results | Intercept + Allocation + Year .......... 139
Figure 95 | Setup | Intercept + Allocation + Year + Latitude .......... 141
Figure 96 | Main results | Intercept + Allocation + Year + Latitude .......... 141
Figure 97 | Main results | Intercept + Allocation + Year + Latitude .......... 143
Figure 98 | Setup ..........
145 Figure 99 | Main results | Latitude | Random-effects ............................................................................. 146 Figure 100 | Dispersion of effects about grand mean vs. dispersion of effects about regression line .... 147 Figure 101 | Display R2 .............................................................................................................................. 149 Figure 102 | Schematic for R2 ................................................................................................................... 149 Figure 103 | Setup .................................................................................................................................... 152 Figure 104 | Display increments ............................................................................................................... 153 Figure 105 | Increments ........................................................................................................................... 154 Figure 106 | Main results | Random-effects ............................................................................................ 157 Figure 107 | Setup .................................................................................................................................... 159 Figure 108 | Main results | Random-effects ............................................................................................ 160 Figure 109 | Plot of log risk ratio on Latitude | Random-effects ............................................................. 161 Figure 110 | Plot of log risk ratio on Latitude | Select variable for X-axis ................................................ 162 Figure 111 | Plot of log risk ratio on Latitude | Blank canvas .................................................................. 163 Figure 112 | Plot of log risk ratio on Latitude | Studies ........................................................................... 
164 Figure 113 | Plot of log risk ratio on Latitude | Regression line ............................................................... 165 Figure 114 | Plot of log risk ratio on Latitude | Confidence interval ........................................................ 168 Figure 115 | Plot of log risk ratio on Latitude | Prediction interval ......................................................... 169 Figure 116 | Plot of log risk ratio on Latitude | Identify studies .............................................................. 170 Figure 117 | Regression | Setup ............................................................................................................... 173 Figure 118 | Regression | Main results | Random-effects ....................................................................... 174 Figure 119 | Regression | Plot | Categorical covariate ............................................................................ 175 Figure 120 | Regression | Plot | Setting the scale anchors ...................................................................... 176 Figure 121 | Regression | Set statistical options ...................................................................................... 177 Figure 122 | Regression | Setup ............................................................................................................... 180 Figure 123 | Set statistical options | Z-Distribution vs. Knapp-Hartung .................................................. 181 Figure 124 | Main results | Z-Distribution ................................................................................................ 182 Figure 125 | Set statistical options | Z-Distribution vs. Knapp-Hartung .................................................. 183 Figure 126 | Main results | Knapp-Hartung ............................................................................................. 
184 Figure 127 | Set statistical options | One-point confidence intervals...................................................... 188 Figure 128 | Set statistical options | Simultaneous confidence intervals ................................................ 188 Figure 129 | Set statistical options | Estimating T2 .................................................................................. 189 Figure 130 | Creating dummy variables ................................................................................................... 193 Figure 131 | Creating dummy variables ................................................................................................... 194 Figure 132 | Categorical variables ............................................................................................................ 195 Figure 133 | Creating dummy variables ................................................................................................... 195 Figure 134 | Dummy variables | Allocation with “Randomized” as the reference group........................ 197 Figure 135 | Dummy variables | Allocation with “Systematic” as the reference group .......................... 198 Figure 136 | Dummy variables | Allocation with “Alternate” as the reference group ............................ 199 Figure 137 | Subgroups | Allocation type................................................................................................. 203 Figure 138 | Data-entry | Dummy variables for Hot and Cold ................................................................. 206 Figure 139 | Basic analysis | Computing T2 in the presence of subgroups .............................................. 207 Figure 140 | Basic analysis | Subgroups Cold vs. Hot ............................................................................... 207 Figure 141 | Basic analysis | Subgroups Cold vs. Hot ............................................................................... 
208 Figure 142 | Regression | Setup | No intercept ....................................................................................... 208 Figure 143 | Regression | Main results | No intercept ............................................................................ 209 Figure 144 | Regression | Main results | Assessing the impact of a set .................................................. 212 9 Figure 145 | Main results | Assessing the impact of a set........................................................................ 213 Figure 146 | Setup | Defining a set of covariates ..................................................................................... 214 Figure 147 | Setup | Naming a set of covariates ...................................................................................... 215 Figure 148 | Setup | Naming a set of covariates ...................................................................................... 215 Figure 149 | Regression | Main results | Working with a set of covariates............................................. 216 Figure 150 | Main results | Removing a set of covariates........................................................................ 216 Figure 151 | Setup | Interaction of two categorical covariates ............................................................... 220 Figure 152 | Main results | Interaction of two categorical covariates ..................................................... 221 Figure 153 | Plot | Interaction of two categorical covariates .................................................................. 223 Figure 154 | Plot | Interaction of two categorical covariates .................................................................. 224 Figure 155 | Plot | Interaction of two categorical covariates .................................................................. 225 Figure 156 | Setup | Interaction of categorical and continuous covariates............................................. 
227 Figure 157 | Main results | Interaction of categorical and continuous covariates .................................. 228 Figure 158 | Plot | Interaction of categorical and continuous covariates................................................ 230 Figure 159 | Plot | Interaction of categorical and continuous covariates................................................ 231 Figure 160 | Plot | Interaction of categorical and continuous covariates................................................ 232 Figure 161 | Setup | Interaction of two continuous covariates ............................................................... 234 Figure 162 | Main results | Interaction of two continuous covariates .................................................... 235 Figure 163 | Plot | Interaction of two continuous covariates .................................................................. 236 Figure 164 | Plot | Interaction of two continuous covariates .................................................................. 237 Figure 165 | Plot | Interaction of two continuous covariates .................................................................. 238 Figure 166 | Setup | Curvilinear relationship ........................................................................................... 240 Figure 167 | Main results | Curvilinear relationship ................................................................................ 241 Figure 168 | Plot | Curvilinear relationship .............................................................................................. 242 Figure 169 | Setup .................................................................................................................................... 244 Figure 170 | Data-entry | Missing data for latitude ................................................................................. 
245 Figure 171 | Basic analysis | Missing data for latitude ............................................................................. 246 Figure 172 | Regression | Setup | Latitude in list and checked ............................................................... 246 Figure 173 | Regression | Main results | Missing data ............................................................................ 247 Figure 174 | Table of missing data............................................................................................................ 248 Figure 175 | Setup | Latitude in list, unchecked ...................................................................................... 249 Figure 176 | Main results | Latitude in list, unchecked ............................................................................ 250 Figure 177 | Setup | Latitude must be removed from list........................................................................ 251 Figure 178 | Setup | Latitude removed from list...................................................................................... 252 Figure 179 | Main results | Latitude removed from list ........................................................................... 252 Figure 180 | Data entry............................................................................................................................. 254 Figure 181 | Basic analysis ........................................................................................................................ 254 Figure 182 | Meta-regression ................................................................................................................... 255 Figure 183 | Select by study name ........................................................................................................... 255 Figure 184 | Select by study name ........................................................................................................... 
256 Figure 185 | Create a moderator for filtering ........................................................................................... 257 Figure 186 | Filter by moderator .............................................................................................................. 258 Figure 187 | Regression using a filter ....................................................................................................... 259 Figure 188 | Select by moderator ............................................................................................................. 260 Figure 189 | Select by moderator ............................................................................................................. 261 Figure 190 | Filter by moderator .............................................................................................................. 261 Figure 191 | Defining several models | Setup .......................................................................................... 262 Figure 192 | Defining several models | Setup .......................................................................................... 262 10 Figure 193 | Defining several models | Main-analysis | Intercept........................................................... 263 Figure 194 | Defining several models | Main-analysis | Intercept + year ................................................ 264 Figure 195 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 265 Figure 196 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 266 Figure 197 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 267 Figure 198 | Defining several models | Setup .......................................................................................... 
268 Figure 199 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 269 Figure 200 | Defining several models | Setup| Year or Latitude ............................................................. 269 Figure 201 | Multiple predictive models | Plot based on Year ................................................................ 270 Figure 202 | Multiple predictive models | Plot based on Year + Latitude ............................................... 271 Figure 203 | Data-entry | Complex data-structures ................................................................................. 273 Figure 204 | Data-entry | Complex data-structures ................................................................................. 274 Figure 205 | Basic analysis | Subgroup within-study as unit of analysis .................................................. 274 Figure 206 | Basic analysis | Subgroup within-study as unit of analysis .................................................. 275 Figure 207 | Regression | Subgroup within-study as unit of analysis ...................................................... 276 Figure 208 | Regression | Subgroup within-study as unit of analysis ...................................................... 277 Figure 209 | Basic analysis | Study as unit of analysis.............................................................................. 278 Figure 210 | Basic analysis | Study as unit of analysis.............................................................................. 278 Figure 211 | Regression | Study as unit of analysis .................................................................................. 279 Figure 212 | Regression | Study as unit of analysis .................................................................................. 280 Figure 213 | Data-entry | Multiple outcomes .......................................................................................... 
282 Figure 214 | Data-entry | Multiple outcomes .......................................................................................... 282 Figure 215 | Basic analysis | Multiple outcomes | Select one outcome .................................................. 283 Figure 216 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence ............. 283 Figure 217 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence ............. 284 Figure 218 | Multiple outcomes | Setup .................................................................................................. 284 Figure 219 | Multiple outcomes | Use all outcomes, assuming independence ....................................... 285 Figure 220 | Multiple outcomes | Use all outcomes, assuming independence ....................................... 286 Figure 221 | Basic analysis | Multiple outcomes | Use mean of outcomes ............................................. 287 Figure 222 | Basic analysis | Multiple outcomes | Use mean of outcomes ............................................. 287 Figure 223 | Multiple outcomes | Use mean of outcomes ...................................................................... 289 Figure 224 | Multiple outcomes | Use mean of outcomes ...................................................................... 290 Figure 225 | BCG Data in Excel™............................................................................................................... 298 Figure 226 | BCG Data in Excel™............................................................................................................... 299 Figure 227 | Flowchart showing how T2 and I2 are derived from Q ......................................................... 304 Figure 228 | Case-A | Main results | Fixed-effect weights....................................................................... 
305 Figure 229 | Case-A | Computing Q.......................................................................................................... 306 Figure 230 | Case-A | Main results |Random-effects .............................................................................. 307 Figure 231 | Case-A | Dispersion of effects about regression line ........................................................... 308 Figure 232 | Case-B | Main results | Fixed-effect weights ....................................................................... 309 Figure 233 | Case-B | Computing Q .......................................................................................................... 310 Figure 234 | Case-B | Main results |Random-effects............................................................................... 311 Figure 235 | Case-B | Dispersion of effects about regression line ........................................................... 312 Figure 236 | Case-C | Main results | Fixed-effect weights ....................................................................... 313 Figure 237 | Case-C | Computing Q .......................................................................................................... 314 Figure 238 | Case-C | Main results |Random-effects............................................................................... 315 Figure 239 | Case-C | Dispersion of effects about regression line ........................................................... 316 Figure 240 | Heterogeneity statistics in basic analysis ............................................................................. 318 11 Figure 241 | Heterogeneity statistics in regression .................................................................................. 319 Figure 242 | Heterogeneity statistics in regression .................................................................................. 
320 Figure 243 | Heterogeneity statistics with subgroups.............................................................................. 321 Figure 244 | Heterogeneity statistics with subgroups.............................................................................. 322 Figure 245 | Heterogeneity statistics with subgroups.............................................................................. 323 Figure 246 | Heterogeneity statistics with continuous covariate ............................................................ 324 Figure 247 | Heterogeneity statistics with continuous covariate ............................................................ 325 Figure 248 | Computing τ2 in the presence of subgroups ........................................................................ 327 Figure 249 | Computing τ2 in the presence of subgroups ........................................................................ 328 Figure 250 | Computing τ2 in the presence of subgroups ........................................................................ 328 Figure 251 | Computing τ2 in the presence of subgroups ........................................................................ 329 Figure 252 | Computing τ2 in the presence of subgroups ........................................................................ 330 Figure 253 | Creating variables for interactions ....................................................................................... 331 Figure 254 | Creating variables for interactions ....................................................................................... 332 Figure 255 | Creating variables for interactions ....................................................................................... 332 Figure 256 | Creating variables for interactions ....................................................................................... 
333 Figure 257 | Creating variables for interactions ....................................................................................... 333 Figure 258 | Plotting a curvilinear relationship ........................................................................................ 334 Figure 259 | Plotting a curvilinear relationship ........................................................................................ 335 Figure 260 | Plotting a curvilinear relationship ........................................................................................ 336 Figure 261 | Plotting a curvilinear relationship ........................................................................................ 336 Figure 262 | Plotting a curvilinear relationship ........................................................................................ 337 Figure 263 | Plotting interaction of two categorical covariates ............................................................... 339 Figure 264 | Plotting interaction of two categorical covariates ............................................................... 340 Figure 265 | Plotting interaction of two categorical covariates ............................................................... 340 Figure 266 | Plotting interaction of two categorical covariates ............................................................... 342 Figure 267 | Plotting interaction of categorical by continuous covariates .............................................. 343 Figure 268 | Plotting interaction of categorical by continuous covariates .............................................. 344 Figure 269 | Plotting interaction of categorical by continuous covariates .............................................. 346 Figure 270 | Plotting interaction of continuous covariates ...................................................................... 
348 Figure 271 | Plotting interaction of continuous covariates ...................................................................... 348 Figure 272 | Plotting interaction of continuous covariates ...................................................................... 351 Figure 273 | CMA | Intercept + Year + Latitude + Allocation | Z | Method of moments ........................ 354 Figure 274 | Metareg| Intercept + Year + Latitude + Allocation | Z | Method of moments ................... 354 Figure 275 | CMA | Allocation | Z | Method of moments ....................................................................... 355 Figure 276 | Metareg | Allocation | Z | Method of moments ................................................................. 355 Figure 277 | CMA | Allocation, Year | Z | Method of moments .............................................................. 356 Figure 278 | Metareg | Allocation, Year | Z | Method of moments ........................................................ 356 Figure 279 | CMA | Intercept, Year-C, Year-C2 | Z | Method of moments .............................................. 357 Figure 280 | Metareg | Intercept, Year-C, Year-C2 | Z | Method of moments........................................ 
PART 1: DATA FILES AND DOWNLOADS

This manual: http://www.meta-analysis.com/pages/cma_manual.php
CMA program: http://www.meta-analysis.com/

BCG data in CMA format
File using period for decimals: http://www.meta-analysis.com/downloads/BCG P.cma
File using comma for decimals: http://www.meta-analysis.com/downloads/BCG C.cma

BCG data in Excel™ format: http://www.meta-analysis.com/downloads/BCG.xls

Excel™ files for plotting interactions:
http://www.meta-analysis.com/downloads/Plot-interaction-of-Hot-by-Time.xls
http://www.meta-analysis.com/downloads/Plot-interaction-of-Hot-x-Year-C.xls
http://www.meta-analysis.com/downloads/Plot-interaction-of-Latitide-x-Year-C.xls
http://www.meta-analysis.com/downloads/Plot-of-curvilinear-relationship.xls

PART 2: OVERVIEW OF META-REGRESSION

INTRODUCTION

In primary studies, multiple regression is the statistical technique employed to assess the relationship between covariates and a dependent variable. In these studies the unit of analysis is the subject, with covariates and the outcome measured for each subject. With a few modifications, the same technique can be used in meta-analysis. In this case, the unit of analysis is the study, with covariates and outcomes measured for each study. We sometimes use the term “meta-regression” to refer to this use of regression in meta-analysis.

With these modifications in place, the full arsenal of procedures that fall under the heading of “multiple regression” in primary studies becomes available to the meta-analyst. For example,

• We can assess the impact of one covariate, or the combined impact of multiple covariates
• We can enter covariates into the analysis in a pre-defined sequence and assess the impact of any covariate over and above the impact of prior covariates
• We can work with sets of covariates, such as three variables that together define a treatment, or that represent a nonlinear relationship between the predictor variable and the effect size
• We can incorporate both categorical (for example, dummy-coded) and continuous variables as covariates

This book is intended as a resource to explain how to run and interpret a meta-regression. It is also intended as a manual to show how to use the program CMA to perform a meta-regression.

In Part 1 we provide links to the data files referenced in this manual.
In Part 2 we provide an overview of this manual.
In Part 3 we introduce the BCG example.
In Part 4 we explain that meta-regression is an observational analysis.
In Part 5 we provide an overview of fixed-effect vs. random-effects models.
In Part 6 we provide a step-by-step guide for running a meta-regression in CMA.
In Part 7 we discuss how to interpret the results of a meta-regression.
In Part 8 we discuss the computation and meaning of R2.
In Part 9 we discuss how to use and customize the regression plot.
In Part 10 we discuss computational options.
In Part 11 we explain how to work with categorical variables.
In Part 12 we discuss when it makes sense to omit the intercept.
In Part 13 we discuss how to work with “sets” of covariates.
In Part 14 we explain how to work with interactions.
In Part 15 we discuss missing data.
In Part 16 we show how to run a regression on subsets of the studies.
In Part 17 we show how to define and compare different predictive models.
In Part 18 we explain how to work with complex data structures.
In Part 19 we discuss some caveats about meta-regression.
In Part 20 we provide a technical appendix.

PART 3: THE BCG EXAMPLE

We will use the “BCG analysis” as the motivating example in this book. “BCG” refers to the Bacillus Calmette-Guerin (BCG) vaccine, which is intended to prevent tuberculosis (TB). The vaccine was studied in a series of 13 controlled trials between 1933 and 1968, with some trials suggesting that the vaccine was effective in reducing the incidence of TB, and others suggesting that it was not.
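Throughout the book, the effect size for each of these trials is the risk ratio (or its log) computed from the trial’s 2×2 table of TB cases by treatment group. As a minimal sketch of that computation (the function name and the illustrative counts below are our own, not data from any BCG trial):

```python
import math

def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio and its variance for one trial.

    events_t, n_t -- TB cases and sample size in the vaccinated group
    events_c, n_c -- TB cases and sample size in the control group
    """
    rr = (events_t / n_t) / (events_c / n_c)
    log_rr = math.log(rr)
    # Large-sample variance of the log risk ratio
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, var

# Hypothetical counts: 10/1000 vaccinated vs. 30/1000 controls
log_rr, var = log_risk_ratio(10, 1000, 30, 1000)
print(round(math.exp(log_rr), 4))  # prints 0.3333 (risk ratio of 1/3)
```

The analysis is carried out on the log scale (where the sampling distribution is approximately normal) and the results are converted back to risk ratios for presentation.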
With the re-emergence of TB in the United States in recent years (including many drug-resistant cases), the question of whether or not BCG was actually effective took on a new urgency. Colditz et al. (1994) conducted a meta-analysis to synthesize the data from these trials.

Figure 1 shows a random-effects meta-analysis based on the BCG studies. The effect size is the risk ratio, as indicated by the labels [A]. In this example a risk ratio of less than 1.0 indicates that the vaccine reduced the risk of TB, a risk ratio of 1.0 indicates no effect, and a risk ratio higher than 1.0 indicates that the vaccine increased the risk of TB. The summary risk ratio [B] is 0.4896, with a 95% confidence interval of 0.3449 to 0.6950 and a p-value of 0.0001. Thus, there is strong evidence that the vaccine is effective in preventing TB.

Figure 1 | Basic analysis | Random effects | Risk ratio

Equally important, however, is the variation in the treatment effect, with the risk ratio in individual studies ranging from 0.2049 (approximately an 80% risk reduction) in one study [C] to 1.5619 (approximately a 56% risk increase) in another [D]. While some of the observed variance in effects is probably due to sampling error, a substantial amount of the variance reflects real differences in the treatment effect (more on this later). Obviously, it’s imperative to understand why the vaccine is more effective in some studies than in others.

Among the studies in the meta-analysis, there appears to be a relationship between climate and effectiveness, such that studies performed in colder locations tended to show a stronger effect. If this relationship is real, it could be explained by either of two mechanisms. First, persons in colder climates may be less likely to have a natural immunity to TB. It follows that the population in these climates would be more susceptible to TB, and more likely to benefit from the vaccine.
Second, it’s likely that the drug would be more potent in the colder climates. This follows from the fact that in warmer climates the heat could cause the drug to lose potency. Optimally, a researcher would be able to code each study for the prevalence of natural immunity and for drug potency, and then use these as predictors of effectiveness in a meta-regression. However, these predictors were not available for the analysis. Therefore, the analysts elected to use “Latitude” (actually, the absolute value of latitude) as a surrogate for these covariates, the assumption being that studies more distant from the equator tended to employ populations with less natural immunity and a more potent vaccine. This was the strategy adopted by Berkey et al. (1995), who employed meta-regression to assess the relationship between latitude and treatment effect. Given the post hoc nature of this analysis, a positive finding would not be definitive, but would suggest a direction for additional research.

This regression has been used as an example in many texts on meta-analysis, including Borenstein et al. (2009), Egger et al. (2001), Sutton et al. (2000), and Hartung et al. (2008). We will use this example here as well, to allow readers to compare our analysis with the analyses presented in the other texts. Note that each text presents the data in a somewhat different format, and here we follow the format employed by Hartung.

In addition to the original variables, we created new variables for the purposes of this text. For example, we classified studies as “Hot” or “Cold” based on the latitude, created variables that are centered versions of the originals, and created variables to represent interactions among the original variables. Part of the dataset is shown in Figure 2. The full set of variables is described in Table 1. In [Appendix 1: The dataset] we show how to create all of the new variables in Excel™.
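The same derivations can also be sketched in code. Below is a plain-Python sketch with made-up illustrative values (not the actual BCG data); the variable names follow Table 1.

```python
# Illustrative values only (not the actual BCG data).
latitudes = [13, 19, 33, 42, 44, 55]
years = [1973, 1973, 1969, 1961, 1948, 1949]

mean_lat = sum(latitudes) / len(latitudes)
mean_year = sum(years) / len(years)

# Centered versions (the "-C" variables in Table 1).
latitude_c = [x - mean_lat for x in latitudes]
latitude_c2 = [x ** 2 for x in latitude_c]        # Latitude-C2
year_c = [y - mean_year for y in years]

# Dichotomized climate: Hot (latitude under 34) vs. Cold (latitude over 41).
hot = [1 if x < 34 else 0 for x in latitudes]
cold = [1 if x > 41 else 0 for x in latitudes]

# Interaction of a dichotomous and a continuous covariate (Hot x Year-C).
hot_x_year_c = [h * y for h, y in zip(hot, year_c)]
```

Centering subtracts the mean from each value, so that a regression intercept can later be read as the predicted effect at the average covariate value.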
Figure 2 | Data-entry screen

Table 1

Original variables
Latitude (Integer): This is absolute latitude, which is simply the latitude ignoring the sign. Low values are closer to the equator; high values are more distant.
Allocation (Categorical): This is the method employed to assign people to the treated or control conditions. The three possible classifications are randomized, alternate, and systematic.
Year (Integer): This is the year the study was conducted.

Variables related to climate
Latitude-C (Decimal): Latitude, centered to have a mean of zero.
Latitude-C2 (Decimal): The square of Latitude-C.
Climate (Categorical): We dichotomized latitude, and classified each study’s location as Hot (latitude under 34) or Cold (latitude over 41).
Hot (Integer): This is a numeric version of Climate, coded 1 for Hot and 0 for Cold.
Cold (Integer): This is a numeric version of Climate, coded 1 for Cold and 0 for Hot.

Variables related to study year
Year-C (Decimal): Year, centered to have a mean of zero.
Time (Categorical): We dichotomized Year, and classified each study’s time as Early (pre-1945) or Recent (post-1945).
Early (Integer): This is a numeric version of Time, coded 1 for Early and 0 for Recent.
Recent (Integer): This is a numeric version of Time, coded 1 for Recent and 0 for Early.

Interactions
Hot x Year-C (Decimal): Interaction of Hot and Year-C (dichotomous and continuous).
Hot x Recent (Integer): Interaction of Hot and Recent (dichotomous and dichotomous).
Latitude-C x Year-C (Decimal): Interaction of Latitude-C and Year-C (continuous and continuous).

The letter “C” added to a variable indicates that the variable has been centered (see appendix).

Types of variables
Categorical: Studies belong to discrete groups, such as “Cold” and “Hot”. This cannot be used directly in the analysis, but is used to create numeric variables (dummy variables).
Integer: Studies have values on an integer scale, and can take on any whole number.
Decimal:
Studies have values on a continuous scale, and can take on any whole or decimal number.

In Figure 1 the treatment effect (or effect size) was displayed as a risk ratio. While the risk ratio has the advantage of being an intuitive index, the analyses are actually performed using the log of the risk ratio, and then converted to risk ratios for display. Since one of our goals in this book is to explain the mechanics of the analyses, we will generally be working with the log units. For example, where Figure 1 showed the risk ratios, Figure 3 shows the same forest plot using log units, as indicated by the labels [E]. In this case, a log risk ratio of less than 0.0 indicates that the vaccine reduced the risk of TB, a log risk ratio of 0.0 indicates no effect, and a log risk ratio greater than 0.0 indicates that the vaccine increased the risk of TB. The summary effect [F] is a log risk ratio of −0.7141.

Figure 3 | Basic analysis | Random effects | Log risk ratio

PART 4: META-REGRESSION IS OBSERVATIONAL

When we work with primary studies we need to be aware of the difference between a randomized study and an observational study. In a randomized trial, participants are assigned at random to a condition, such as treatment versus placebo. The randomization is intended to ensure that the subjects in the two conditions are similar in all respects except for the treatment. Therefore, assuming that the randomization works properly, any differences that emerge between groups can be attributed to the treatment. By contrast, in an observational study we compare pre-existing groups, such as workers with a college education versus those who did not attend college. While we can report on differences between groups, we cannot attribute these differences to the presence or absence of a college education, because the groups differ in other ways as well.
For example, it is possible (indeed likely) that subjects who had a college education also had other advantages, including better skills in an array of areas.

Consider how this plays out in a meta-analysis. Assume we start with a set of randomized experiments that assess the impact of an intervention. If the effect in each study can serve to establish causality because of the randomization process, then the summary effect can also serve to establish causality. If the effect in each study is due to the intervention, then the overall effect is due to the intervention. However, even if the individual studies are randomized trials, once we move beyond the goal of reporting a summary effect and proceed to perform a subgroup analysis or meta-regression, we have moved out of the domain of randomized experiments and into the domain of observational studies. In this example, if the effect size is different in hot climates than in cold climates we cannot assume that this is because of climate. While we choose to label one group of studies “Hot climate” and another “Cold climate”, it is possible that the two groups differ from each other in other ways as well, and that it might be these other factors (instead of climate, or in addition to climate) that are responsible for the difference in effects. For example, it turns out that the “Hot” studies, where the vaccine was less effective, tend to be more recent. We’d like to think that the vaccine was less effective because of storage conditions in the hot climates, and that if we used better storage the efficacy would improve. However, it’s also possible that there were unrelated changes over the decades that caused the vaccine to become less effective.

That said, in primary observational studies, researchers sometimes use regression analysis to try to remove the impact of potential confounders.
This is not a perfect solution, since there may be other confounders of which we are not aware, but this approach can help to isolate the impact of specific factors and generate hypotheses to be tested in randomized trials. The same holds true for meta-regression.

This approach is potentially useful only when there are enough studies to isolate the unique impact of each factor. Many meta-regressions are based on relatively small numbers of studies, and so it may not be possible to adjust for potential confounds. Of course, even when there are enough studies to adjust for known confounds, we cannot be certain that we’ve identified all possible confounds. Therefore we can’t use this approach to prove a causal relationship.

There is one exception to the rule that subgroup analysis and regression cannot prove causality. This is the case where not only was assignment to treatment condition randomized, but also assignment to subgroup was randomized. In this case we know that the only systematic difference between subgroups is the one captured by subgroup membership. The pharmaceutical example discussed later is a case in point. In this hypothetical example we enrolled 1000 patients and assigned some to studies that would test a low dose of the drug vs. placebo, and others to studies that would test a high dose of the drug vs. placebo. Here, the assignment to subgroups is random. The same would apply if the patients were assigned to ten studies where the dose of the drug was varied on a continuous scale, and we used meta-regression to test the relationship between dose and effect size. This set of circumstances (the drug company example) is extremely rare in practice. We present it here primarily to illustrate the conditions that would be needed before we could draw a causal inference from a subgroup analysis or regression. Our main point is that in the absence of these conditions we cannot draw a causal inference.

PART 5: FIXED-EFFECT VS.
RANDOM-EFFECTS

Statistical models

Before turning to regression, we briefly review the statistical models typically employed in meta-analysis: the fixed-effect model and the random-effects model. To understand the difference between these two models it might help to distinguish between the ideas of a population versus a universe. One study samples subjects from a population, which is defined as people meeting a specific set of criteria. A second study samples subjects from a population defined by its own set of criteria. If the two sets of criteria are the same in all relevant respects, then we can say that both studies are sampling from the same population. Equivalently, if the true effect size (the effect size that we would see if there was no sampling error) is identical in both populations, then we can think of them as the same population. However, if the criteria differ in any material respect, then we would say that the two studies sample from different populations. In this case each study has its own criteria that define the study’s population, and we need a second set of criteria that tell us what kinds of studies (what populations) we want to include in the meta-analysis. This second set of criteria defines the universe of populations. For example, suppose that each study samples persons from the lung-cancer outpatient clinic at the hospital where the study is being conducted.

• If there are ten studies in the analysis and all were conducted at the same hospital (let’s also assume at the same time), then all studies are being sampled from the same population. The true effect size in all the studies is the same.
• If there are ten studies in the analysis and each was conducted at a different hospital, then each study is sampled from a different population. The true effect size probably differs (possibly by a little, or possibly by a lot) from one hospital to the next.
In the second case (ten hospitals), if we decide to include all these hospitals in the meta-analysis, it’s because all the populations are from the same universe. We would probably define the universe as clinics where the patients are similar enough so that the studies are all addressing the same fundamental question.

To this point we’ve focused on the patients in discussing the difference between a population and a universe, but the distinction between the two depends also on other aspects of the study. For example, studies may be from the same population if they all run for exactly two weeks. If some studies run for two weeks and others for three weeks, then the two kinds of studies are based on different populations but are drawn from the same universe. Similarly, studies may be from the same population if they all use the identical measure of outcome. If some studies use one measure while others use a similar measure, then they are based on different populations but are drawn from the same universe.

It should be clear that we are defining a population very narrowly. Under this definition studies are only drawn from the same population if they are essentially replicates of each other in all material respects. This means that not only the subjects but also the methods, the specifics of the intervention, and the outcome measures are, for all intents and purposes, identical across studies. This criterion will rarely be met in practice, and will never be met when we’re working with studies conducted independently of each other.

With this as background we can discuss the difference between the fixed-effect and the random-effects models.

• The fixed-effect model applies if all studies are drawn from a single population (the identical subjects and methods). The studies share a common effect size, and so the effect size is fixed, or constant.
• The random-effects model applies if the studies are drawn from a universe of populations.
The true effect size varies from one population to the next, and the studies are sampled at random from this universe.

The selection of a model is critically important for several reasons. First, it sets up a framework for the analysis, establishing what questions we can ask and how to interpret the results.

• Under the fixed-effect model we assume that all studies share a common true effect size, and our goal is to estimate this common parameter.
• Under the random-effects model we allow that the true effect size might vary from study to study, and our goal is to estimate the mean of these parameters.

Second, the selection of a model affects how weights are assigned to the studies. This affects both the summary estimates themselves and also the precision of the summary estimates.

• Under the fixed-effect model there is only one level of sampling (the subjects in each study are sampled from all subjects in the study’s population) and therefore only one source of sampling error (the observed effect in each study differs from the true effect for that study’s population). The error variance for each study is quantified as V, and the weight assigned to each study is the inverse of this variance, or 1/V.
• Under the random-effects model there are two levels of sampling (the subjects in each study are sampled from all subjects in the study’s population, and the study populations are sampled from the universe of study populations) and therefore two sources of sampling error (the observed effect in each study differs from the true effect for that study’s population, and the mean true effect for the sampled studies differs from the mean for the universe of studies). The first error variance is quantified as V and the second as T2. The total error variance for each study is then V + T2, and the weight assigned to each study is the inverse of this variance, or 1/(V + T2).

We typically encounter the idea of fixed-effect vs.
random-effects models in the context of a simple analysis, where we have a single set of studies. In this case, the distinction between the two models is relatively straightforward: the fixed-effect model applies if the studies share a common effect size, and the random-effects model applies otherwise. However, the same idea can be extended to the case where we have discrete subgroups of studies, and even further, to the case where we have studies that range along a continuum on some dimension(s). Our goal here is to explain these extensions, and for that purpose we will show how the difference between fixed-effect and random-effects models plays out in (A) a simple analysis, (B) a subgroups analysis, and (C) a meta-regression.

Case A - A simple analysis

Consider the case where we are working with a single set of studies. The fixed-effect model applies if all the studies share a common effect size. The random-effects model applies if the effect size might vary from study to study, across all studies in the analysis.

Case B - Subgroups

Consider the case where we want to compare the effect size for two or more subgroups of studies (for example, studies in cold climates vs. studies in hot climates). The fixed-effect model applies if all studies within a subgroup share the same effect size parameter. The random-effects model applies if the effect size parameters might vary from study to study, for studies within a subgroup.

Case C – Regression

Consider the case where we want to look at effect size in relation to a continuous covariate (for example, latitude). The fixed-effect model applies if all studies at the same latitude share the same effect size parameter. The random-effects model applies if the effect size parameters vary from study to study, for studies at the same latitude.

While we have presented these three cases as being distinct from each other, the fact is that they can all be subsumed under the same general principle.
Specifically,

• We use the fixed-effect model when all studies with the same predicted value have the same true effect size.
• We use the random-effects model when studies with the same predicted value have different true effect sizes.

Thus,

• In Case A the frame of reference is all studies, and the predicted value for any study is the mean of all studies.
• In Case B the frame of reference is all studies within the subgroup, and the predicted value for any study is the corresponding subgroup mean.
• In Case C the frame of reference is all studies at the same latitude, and the predicted value for any study is the predicted value given by the regression equation.

If the predicted value for all studies within the frame of reference is the same, then the only source of variance is within-study variance (V), and so the fixed-effect model applies. Otherwise we need to account also for between-study variance (T2), and so the random-effects model applies. Here, we present three examples to show how this idea applies to the three cases outlined above.

Case A - A simple analysis

Case A1. A pharmaceutical company draws a sample of 1000 patients for a randomized controlled trial, but cannot work with all of the patients at one time because of limited space. Therefore, the patients are randomly assigned to one of ten cohorts, and each cohort starts treatment in a different week. If we assume no training effect and no seasonal effect, and ensure that all procedures are identical from one cohort to the next, it follows that the treatment effect should be the same for all cohorts. If we treat each cohort as a separate study and use meta-analysis to synthesize the results, then the fixed-effect model would apply.

Case A2. Ten universities form a consortium to run studies with the identical protocol (one at each university) and then use meta-analysis to synthesize the results.
This is similar to Case A1, but the assumption that the effect size will be the same in all studies is more tenuous. It might be possible to ensure that the intervention is identical at all the universities, but it might not. It might be possible to ensure that the subjects are identical in all relevant respects at all the universities, but it might not. Therefore, it’s possible that the fixed-effect model would apply, but it might not. In this case it would probably be a good idea to use the random-effects model. If it turns out empirically that the effect sizes do differ, then the random-effects model will have been the correct choice. If it turns out empirically that the effect sizes do not differ, the random-effects weights will be identical to the fixed-effect weights, and so there is no price to pay for having chosen the random-effects model.

Case A3. The vast majority of simple meta-analyses are similar neither to Case A1 nor to Case A2. Rather, they involve studies planned and performed by different teams of researchers without prior coordination. For example, we may locate 10 studies in the literature that seem to address the same fundamental question. We may decide that the studies are similar enough that it makes sense to perform a synthesis (for example, they all tested the same intervention), but there is no reason to assume that the true effect size will be identical in all studies. Rather, it is likely that the impact of the intervention will be affected (at least a little) by details of the sample (age, history), of the intervention itself (dose, duration), and/or the outcome measure (one test or another). If the studies are fundamentally the same (in the sense that they address the same basic question), then meta-analysis may enable us to identify the core impact of the intervention, cutting through the noise created by these differences. However, to identify this impact properly we need to take account of this noise by using random-effects weights.
In this case, the random-effects model is a more plausible fit for the data.

Case B - Subgroups

The same idea is easily extended to the case of an analysis where we want to compute the summary effect size for two subgroups of studies, and then compare the two.

Case B1. In Case A1 (above), the pharmaceutical company randomly assigned patients to one of ten identical cohorts to test Drug-A vs. Control. Suppose that the next year the company did exactly the same thing with a new sample of patients, to test Drug-B vs. Control. The Drug-A studies are Subgroup-A and the Drug-B studies are Subgroup-B. There is no expectation that the effect size will be the same in the two subgroups. Indeed, we may expect that the effect size will be greater in one subgroup than the other. However, we do expect all studies in the first group to have the same effect size as each other, and all studies within the second group to have the same effect size as each other. The fixed-effect model applies here.

Case B2. In Case A2 (above), the consortium planned to run ten identical studies (one at each university). They hoped that the studies would be identical in all ways that could impact on the effect size, but could not ensure that this would be the case. Suppose that the next year they continued the arrangement, and ran ten more studies, this time using a variant of the intervention. The same logic applies as in Case A2, the only difference being that the logic now applies within subgroups. While it’s possible that all studies within a subgroup share the same effect size, it’s also possible that the effect size varies. Therefore, it’s probably safer to apply the random-effects model to compute the summary effect size within subgroups.

Case B3. In Case A3 (above), researchers located 10 published studies that were similar, but not identical. Logic dictates that the effect size will vary from study to study, and therefore the random-effects model is appropriate.
The same example can be extended to subgroups. Suppose researchers located 10 studies that assessed the impact of Drug-A vs. Control (Subgroup-A), and another 10 that assessed the impact of Drug-B vs. Control (Subgroup-B). We may expect the effect sizes within either subgroup to be similar to each other, but we have no reason to expect that they will be identical. Therefore, the random-effects model is a more plausible fit for the data within subgroups.

Case C – Regression

Finally, the same idea can be extended to the case where we use a continuous covariate (or a set of covariates) to predict the effect size.

Case C1. In Case A1 (above), the pharmaceutical company randomly assigned patients to one of ten identical cohorts to test Drug-A vs. Control. We can extend Case A1 to a situation where the company runs ten identical cohorts, and assume that it repeats this process five times, each time with a different dose of the drug. The meta-analysis looks at the relationship of dose with effect size. For all studies at the same dose, the effect size should be the same. The fixed-effect model makes sense here.

Case C2. In Case A2 (above), the consortium planned to run ten identical studies (one at each university). We can extend Case A2 to a situation where the consortium runs ten studies based on the same protocol, and assume that it repeats this process five times, each time with a longer intervention. The meta-analysis looks at the relationship of the intervention’s duration with effect size. For all studies of the same duration, the effect size might be the same, but might not. The random-effects model is probably the better choice.

Case C3. In Case A3 (above), when studies are performed using different protocols, logic dictates that the effect size will vary from study to study. We can extend Case A3 to a situation where we locate all studies that assessed the impact of an intervention, and then code them based on the dosage.
While studies with a similar dose may tend to have similar effect sizes, logic dictates that the effect size for any given dose will still vary. The random-effects model is a more plausible fit for the data.

How the model affects the analysis

The selection of a statistical model must be based on the sampling frame (as outlined above), and not on the fact that one model will yield a more desirable estimate of the effect size or its precision (as outlined below). That said, it’s helpful to understand how the selection of one model or the other affects the estimates of effect size and precision. Again, it’s easiest to explain this for the simple case, and then extend the example to the case of subgroups and regression.

Recall that V represents the within-study variance (the variance of the observed effects about the true effect for that study’s population), while T2 represents the between-study variance (the variance of the true effects about the mean true effect for all studies in the universe of populations). The basic idea is that uncertainty (and therefore the weights) under the fixed-effect model is based on V, whereas uncertainty (and therefore the weights) under the random-effects model is based on V + T2. The only thing that changes as we move from Case A to Case B to Case C is the frame of reference for estimating T2, as follows.

• When we are working with a single population, T2 reflects the dispersion of true effects across all studies, and is therefore computed for the full set of studies.
• When we are working with subgroups, T2 reflects the dispersion of true effects within a subgroup, and is therefore computed within subgroups.
• When we are working with regression, T2 reflects the dispersion of true effects for studies with the same predicted value (that is, the same value on the covariates), and is therefore computed for each point on the prediction slope.
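For the simplest frame of reference, where T2 is computed over the full set of studies, a common choice is the method-of-moments estimator. The sketch below is our own plain-Python illustration of that estimator (the function name is ours); it is one standard way to estimate T2, not necessarily the exact computation CMA performs.

```python
def tau_squared_mm(effects, variances):
    """Method-of-moments estimate of T2 over a single set of studies.
    effects: observed effect sizes (e.g., log risk ratios);
    variances: the within-study variances V."""
    w = [1.0 / v for v in variances]               # fixed-effect weights
    sw = sum(w)
    m = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - m) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - df) / c)                  # truncated at zero
```

If the observed dispersion (Q) is no larger than its expectation under sampling error alone (df), the estimate is truncated at zero, in which case the random-effects weights reduce to the fixed-effect weights.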
As a practical matter, of course, most points on the slope have only a single study, and so this computation is less transparent than that for the single population (or subgroups), but the concept is the same.

The practical implications of using a random-effects model rather than a fixed-effect model are the same for all three cases (simple analysis, subgroups, and regression). To wit,

• The random-effects model will lead to more moderate weights being assigned to each study. As compared with a fixed-effect model, the random-effects model will assign more weight to small studies and less weight to large studies.
• Under the random-effects model, the confidence interval about each coefficient (and slope) will be wider than it would be under the fixed-effect model.
• Under the random-effects model, the p-values corresponding to each coefficient and to the model as a whole are less likely (on average) to meet the criterion for statistical significance.

These points are evident in Figure 4 and Figure 5, which show the regression of log risk ratio on latitude using fixed-effect and random-effects weights, respectively.

Figure 4 | Regression of log risk ratio on latitude | Fixed-effect

Figure 5 | Regression of log risk ratio on latitude | Random-effects

Relative weights

Under the fixed-effect model (Figure 4) the study weights tend to be more extreme, with large studies getting substantially more weight than small studies. Under the random-effects model (Figure 5), the study weights tend to be more moderate, with relatively small differences between the studies. The selection of a model will have a substantial impact on the regression line if studies that fall outside the pattern happen to be especially large. These studies will tend to pull the regression line towards themselves, and this will be more important under fixed-effect weights than under random-effects weights.
(Conversely, a small study will have more impact on the regression line when we apply random-effects weights.) In the BCG example the larger studies tend to fall within the pattern of the others, and so the regression lines are similar in the two plots.

Absolute weights

Under the fixed-effect model there is only one source of sampling variance (within-study). Therefore the weights (which are the inverse of the variance) are relatively large, which yields a relatively narrow confidence interval (Figure 4). Under the random-effects model there is an additional source of sampling variance (between-study). Therefore the weights are smaller, which yields a wider confidence interval (Figure 5).

PUTTING REGRESSION IN CONTEXT

In primary studies, we sometimes think of analyses as belonging to one of three types.

• Case A. Simple analysis, where the goal is to estimate the mean.
• Case B. Analysis of variance, where the goal is to estimate the mean effect in two (or more) subgroups of subjects, and then see if (and how) the mean varies by subgroup.
• Case C. Regression, where the goal is to estimate the mean effect for subjects that share the same values on one (or more) covariates, and then see if (and how) the mean varies as a function of the covariate values.

In fact, though, regression is a general system that includes all three of these cases. In other words, we can use regression not only in Case C but also in Cases A and B. If we did so, we would get the identical answers using regression that we get using a simple analysis or analysis of variance. The same holds true for meta-analysis. While we tend to employ meta-regression only for Case C, we also have the option to use it for Case A or B. We will take advantage of this fact to help explain meta-regression. Specifically,

• For Case A we will perform an analysis using the traditional approach and then using regression, to show the correspondence between the two.
• For Case B we will perform a subgroups analysis using the traditional approach and then a regression, to show the correspondence between the two.
• For Case C, there is no simpler approach, and so we will move directly to the regression.

The goals of a fixed-effect analysis are different from the goals of a random-effects analysis, and for that reason we will address each separately. We'll run through this sequence of cases (A, B, C) using the fixed-effect model, and then for the random-effects model.

In the text that follows we focus on conceptual issues. The statistical formulas that underlie these issues are presented in Appendix 2: Understanding Q and Appendix 3: Tests of heterogeneity.

FIXED-EFFECT MODEL

Basic analysis (Case A)

Here, we present a meta-analysis of the BCG studies. This is a basic meta-analysis in the sense that our goal is simply to estimate the mean effect size for the full set of studies. We will perform this exercise (1) using the traditional approach and then (2) using meta-regression, to show the correspondence between the two.

The traditional approach

Figure 6 and Figure 7 are screen-shots from CMA using the traditional approach to meta-analysis. Note that the effect size is in log units [A]. Each row shows the effect size and confidence interval for one study. The lines marked "Fixed" in Figure 6 [B] and in Figure 7 [C] show the summary (common) effect size as −0.4303, with a 95% confidence interval of −0.5097 to −0.3509. The Z-value for a test of the null is −10.6247 with a corresponding p-value of < 0.0001.

Figure 6 | Basic analysis | Fixed-effect | Log risk ratio

Test of effect size

Is the effect size zero? The line labeled "Fixed" in Figure 6 [B] and Figure 7 [C] shows the effect size is −0.4303 with a standard error of 0.0405. The Z-value for a test of the null is −10.6247 with a corresponding p-value of < 0.0001. We would conclude that the common effect size is probably not zero.
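The fixed-effect summary effect described above is an inverse-variance weighted mean. The following is a minimal sketch of that computation in Python (not part of CMA); the effect sizes and variances are illustrative made-up values, not the actual BCG data.

```python
import math

# Hypothetical log risk ratios and within-study variances (NOT the BCG data)
y = [-0.9, -1.6, -1.2, -0.2, -0.5]
v = [0.33, 0.44, 0.02, 0.05, 0.01]

w = [1 / vi for vi in v]                       # fixed-effect weight = inverse variance
M = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)   # common (summary) effect
se = math.sqrt(1 / sum(w))                     # standard error of the summary effect
z = M / se                                     # Z-test of the null (effect size = 0)

# Q: weighted sum of squared deviations about the summary effect;
# under the fixed-effect model df = (number of studies) - 1
Q = sum(wi * (yi - M) ** 2 for wi, yi in zip(w, y))
df = len(y) - 1
print(round(M, 4), round(se, 4), round(z, 4), round(Q, 4), df)
```

Note how the large weights attached to the small-variance studies dominate the summary; this is exactly the behavior contrasted with random-effects weighting earlier.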
Figure 7 | Basic analysis | Fixed-effect | Log risk ratio

Test of the statistical model

Are the data consistent with the fixed-effect model? In Figure 7, the section labeled Heterogeneity [D] presents statistics that address the heterogeneity in effect size. The Q-value is 152.2330 with 12 degrees of freedom and a corresponding p-value under 0.0001. This tells us that the true effect size probably varies across studies, which means that the data are not consistent with the assumptions of the fixed-effect model.

The regression approach

We can perform the same analysis using meta-regression. Figure 8 shows the screen in CMA where we define the model. Since our goal here is to estimate the common effect size (that is, the intercept) we have included no covariates except for the intercept [E]. The results are shown in Figure 9.

Figure 8 | Regression setup | Intercept only

Figure 9 | Regression | Main results | Fixed-effect | Intercept only

Test of effect size

Is the effect size zero? Since there are no covariates, the predicted effect is simply the intercept, and so the question "Is the effect size zero?" is addressed by a test of the intercept. In Figure 9, the regression equation [F] gives the predicted effect size for all studies as Y = −0.4303 with a standard error of 0.0405, variance of 0.0016, and confidence interval of −0.5097 to −0.3509. The Z-value for a test of the null is −10.6247 with a corresponding p-value of < 0.0001. We would conclude that the effect size is probably not zero. Note that these numbers match the numbers from the traditional analysis in Figure 6 [B] and Figure 7 [C].

Analysis of variance

In Figure 9, results for the fixed-effect regression are presented using an analysis of variance framework, where the total [I] weighted sum of squares (WSS, or Q) is partitioned into the part explained by the predictive model [G] and the residual [H].

Model

The line labeled "Model" [G] displays the WSS explained by the predictive model.
Since there are no covariates in this example, this line has no relevance here. The Q-value and df are displayed as 0.0, and the p-value as 1.0.

Residual

The line labeled "Residual" [H] displays the WSS not explained by the model and tests the hypothesis that all studies share a common (true) effect size. Since the Q-value is 152.2330 with df = 12 and p < 0.0001, we conclude that the true effect size probably varies across studies. Thus, the data are not consistent with the assumptions of the fixed-effect model. Note that the numbers are identical to those in Figure 7 [D].

Total

The line labeled "Total" [I] displays the total WSS for the full set of studies (with no predictors). In this case the total WSS is the same as the residual WSS, since both are based on the variance across all 13 studies. The Q-value is 152.2330 with df = 12 and p < 0.0001. Again, the numbers are identical to those in Figure 7 [D].

Summary

Our goal here was to show the correspondence between a traditional analysis and a regression for a simple analysis.

• In a meta-analysis with no covariates we want to estimate (and test) the effect size. This question is addressed by the common effect (−0.4303) in the traditional analysis, and by the intercept (−0.4303) in the regression. In both cases the standard error is 0.0405 and the p-value is < 0.0001, which tells us that the true effect size is probably not zero.
• We also want to know if the data are consistent with the fixed-effect model. This question is addressed by the Q-test for heterogeneity in the traditional analysis, and by the Q-test for the residual in the regression. In both cases the Q-value is 152.2330 with df = 12 and p < 0.0001. This tells us that the true effect size probably varies across studies, and so the assumptions of the fixed-effect model have been violated.

Subgroups analysis (Case B)

Above, we found that the impact of the vaccine varied from study to study.
The researchers hypothesized that this variation might be explained by the fact that studies were conducted in various locations, and that the vaccine was more effective in colder climates. To test this hypothesis we can classify each study as being either "Cold" or "Hot" based on its latitude, and then perform an analysis (a) to estimate the effect size in each subgroup of studies, and (b) to compare the effect size for the two subgroups. (This differs from the original analysis, where the researchers used latitude as a continuous covariate rather than creating two groups.)

The traditional approach

In Figure 10 the studies have been divided into subgroups as follows.

• The six "Cold" studies are at the top, followed by their common effect, a log risk ratio of −0.9986 [A].
• The seven "Hot" studies are at the bottom, followed by their common effect, a log risk ratio of −0.1115 [B].

Figure 10 | Subgroups Cold vs. Hot | Fixed-effect

The same statistics are also shown in the "Fixed effect" section of Figure 11. The lines labeled [A] and [B] in Figure 11 correspond to the lines labeled [A] and [B] in Figure 10.

Is the common effect size zero for each subgroup?

For the cold studies this is addressed by Figure 10 [A]. The same numbers are displayed in Figure 11 [A]. The effect size is −0.9986 with a standard error of 0.0676, variance of 0.0046, and confidence interval of −1.1310 to −0.8662. The Z-value for a test of the null is −14.7808 with a corresponding p-value of < 0.0001. We would conclude that the common effect size for studies in cold climates is probably not zero.

For the hot studies this is addressed by Figure 10 [B]. The same numbers are displayed in Figure 11 [B]. The effect size is −0.1115 with a standard error of 0.0506, variance of 0.0026, and confidence interval of −0.2107 to −0.0124. The Z-value for a test of the null is −2.2042 with a corresponding p-value of 0.0275.
We would conclude that the common effect size for studies in hot climates is probably not zero.

Figure 11 | Subgroups Cold vs. Hot | Fixed-effect

Analysis of variance

In Figure 11, the section labeled Heterogeneity [C] shows how the total variance is partitioned into its component parts. Where analysis of variance in a primary study is based on sums of squares (SS), the subgroups analysis in a meta-analysis is based on weighted sums of squares (WSS, called Q). Still, the basic idea is the same. The total Q can be partitioned into its component parts – the Q explained by the subgroups and the Q within subgroups (and thus unexplained, or residual).

Total within

The fixed-effect model requires that all studies within the same subgroup share the same true effect size. This assumption is tested by the Q-statistic, where Q and its degrees of freedom are computed within subgroups and then summed across subgroups. Here, Q = 41.7894, df = 11, and p < 0.0001. This tells us that effects probably do vary within subgroups, and the fixed-effect model is not valid [D].

Total between

The line labeled "Total between" is a test of the predictive model. Here, it addresses the question "Does effect size vary by subgroup?" The Q-value of 110.4436 with df = 1 and p < 0.0001 tells us that it probably does vary by subgroup [E].

Overall

The line labeled "Overall" reflects the total dispersion. It addresses the question "Do the effects vary from each other if we ignore the subgroups and compute the variance of all studies about the grand mean?" The Q-value of 152.2330 with df = 12 and p < 0.0001 tells us that they probably do vary [F]. Note that this is the same question we asked in the simple analysis (with no subgroups), and so it follows that the statistics for this section (Overall) are identical to those we saw for the simple analysis in Figure 7 [D].

Note that the variance components are additive.
The Q-value within the Hot studies plus the Q-value within the Cold studies yields the total Q-value within subgroups. Then, the Q-value within subgroups plus the Q-value between subgroups yields the total Q-value.

The regression approach

We can perform the same analysis as a meta-regression. Figure 12 shows the screen where we define the model.

• The first covariate is the intercept.
• The second covariate is the variable called Climate: Hot. This covariate will address the question of whether or not the effect size varies by climate. The sub-designation (Hot) follows the convention that variables are named and coded for the presence of an attribute. Since the variable is called "Hot", Cold studies will be coded 0 while Hot studies will be coded 1.

Figure 12 | Regression Cold vs. Hot | Setup

Results are displayed in Figure 13.

Figure 13 | Regression Cold vs. Hot | Fixed-effect

Analysis of variance

In Figure 13 the Analysis of variance [I] shows how the total WSS [L] is partitioned into its component parts – the WSS explained by the Model (here, subgroups) [J] and the Residual WSS (here, within subgroups) [K].

Model

The line labeled "Model" [J] asks if the predictive model (climate) explains any of the variance in effect size. Put another way, it asks if the dispersion of effects about the regression line is smaller when the regression line is based on climate rather than based solely on the grand mean. The analysis shows that Q = 110.4436 with df = 1 and p < 0.0001, so we conclude that the predictive model probably explains (at least) some of the variance in effect size.

Residual

The line labeled "Residual" [K] asks if the data are consistent with the model's assumption of a common effect size for all studies with the same climate. The Q-value is 41.7894 with df = 11 and p < 0.0001. We conclude that the data are not consistent with the assumptions of the fixed-effect model.
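The additivity of these weighted sums of squares can be checked directly from the values reported above. A quick arithmetic sketch in Python (using the Q-values quoted in the text):

```python
# The Q partition reported above for the Cold vs. Hot analysis:
# Q_model (between subgroups) + Q_residual (within subgroups) = Q_total.
Q_model = 110.4436     # between subgroups, df = 1
Q_residual = 41.7894   # within subgroups, df = 11
Q_total = 152.2330     # about the grand mean, df = 12

assert round(Q_model + Q_residual, 4) == Q_total   # 110.4436 + 41.7894 = 152.2330
assert 1 + 11 == 12                                # degrees of freedom add as well
print("Q partition is additive:", Q_model + Q_residual)
```

The same check applies to every analysis in this chapter: the model and residual components always sum to the total, in both Q and df.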
Total

The line labeled "Total" [L] asks if the between-study variance for the full set of studies (with no subgroups) is zero. This analysis is the same whether or not there are subgroups and so, as in the prior analysis, Q = 152.2330, df = 12, p < 0.0001.

Note that the variance components are additive. The Q-value for the residual plus the Q-value for the model yields the total Q-value.

Prediction equation

The prediction equation [I] is −0.9986 + 0.8870 × Climate. Since Climate is coded 0 for Cold and 1 for Hot, the predicted value for Cold studies is

Y = −0.9986 + 0 × 0.8870 = −0.9986, (1.1)

while the predicted value for Hot studies is

Y = −0.9986 + 1 × 0.8870 = −0.1116. (1.2)

Note that these are the same numbers we saw in the subgroups analysis.

• Figure 10 [A] and Figure 11 [A] showed the mean effect size for the Cold studies as −0.9986, which is the same number we see in (1.1).
• Figure 10 [B] and Figure 11 [B] showed the mean effect size for the Hot studies as −0.1115, which (within rounding) is the same number we see in (1.2).

Summary

The total Q of each effect size about the grand mean can be partitioned into its component parts – the Q due to the variation in effect size that can be explained by subgroup membership, and the part that cannot. The traditional approach and the regression approach use somewhat different nomenclatures but are identical mathematically and yield precisely the same answers.

• The Q-between (in the traditional model) and the Q-model (in the regression) are both 110.4436 with df = 1 and p < 0.0001. Each tells us that effect size probably differs between subgroups.
• The Q-within (in the traditional model) and the Q-residual (in the regression) are both 41.7894 with df = 11 and p < 0.0001. Each tells us that the assumptions of the fixed-effect model have been violated.
• The Q-total in each case is 152.2330 with df = 12 and p < 0.0001.
Each tells us that effect sizes vary when we ignore subgroups and work with deviations of all studies from the grand mean.
• The Q-values are additive. Q-between plus Q-within equals Q-total.

Continuous covariate (Case C)

Immediately above (Case B) we divided the studies into "Hot" or "Cold" climates, which allowed us to perform a subgroups analysis. In the original paper, the researchers worked with the absolute latitude of each study as a continuous covariate. We turn to that analysis now.

The traditional approach

There is no mechanism to work with a continuous covariate in the traditional framework.

The regression approach

Figure 14 shows the screen where we define the regression. We will use the intercept and Latitude to predict the effect size. The covariate will address the question of whether or not effect size is related to latitude. Figure 15 shows the results of this analysis.

Figure 14 | Regression | Latitude | Setup

Figure 15 | Regression | Latitude | Fixed-effect

Analysis of variance

In the section labeled Analysis of variance [A] the WSS-total is partitioned into its component parts – the WSS explained by latitude (Model) and the WSS-residual.

Model

The line labeled "Model" [B] addresses the hypothesis that the predictive model (latitude) explains any of the variance in effect size. Put another way, it asks if the dispersion of effects about the regression line is smaller when the regression line is based on latitude rather than based solely on the grand mean. Since Q = 121.4999 with df = 1 and p < 0.0001, we conclude that the predictive model probably explains (at least) some of the variance in effect size.

Residual

The line labeled "Residual" [C] addresses the hypothesis that the data are consistent with the model's assumption of a common effect size for all studies at the same latitude. The Q-value is 30.7331 with df = 11 and p = 0.0012. We conclude that the data are not consistent with the assumptions of the fixed-effect model.
Rather, the true effect size does vary from study to study, even for studies at the same latitude.

Total

The line labeled "Total" [D] addresses the hypothesis that the variance for the full set of studies (with no predictors) is zero. This analysis is the same whether or not there are subgroups or covariates and so, as in the prior analyses, Q = 152.2330, df = 12, p < 0.0001.

Summary

The total Q of each effect size about the grand mean can be partitioned into its component parts – the Q due to the variation in effect size that can be explained by latitude, and the part that cannot.

• The Q-value for the model is Q = 121.4999 with df = 1 and p < 0.0001, which tells us that effect size is related to latitude.
• The Q-value for the residual is 30.7331 with df = 11 and p = 0.0012, which tells us that the assumptions of the fixed-effect model have been violated.
• The Q-value for the total is 152.2330 with df = 12 and p < 0.0001, which tells us that effect sizes vary when we ignore latitude and work with deviations of all studies from the grand mean.

In context

We presented three cases to show the correspondence between regression and a traditional analysis. In Case A there were no covariates; in Case B there was a categorical covariate; and in Case C there was one continuous covariate.

For the traditional analysis the fixed-effect model requires that all studies share the same true effect size (Case A), or that all studies within a subgroup share the same effect size (Case B). This assumption is tested by the Q-value based on deviations from the grand mean (Case A), or by the Q-value based on deviations from each study's subgroup mean (Case B).

The regression model is a more general model and allows us to cover all cases by saying that the effect size must be identical for all studies with the same predicted value. In Case A the predicted value is the grand mean (as it was for the traditional analysis).
In Case B the predicted value is the subgroup mean (as it was for the traditional analysis). In Case C the predicted value is the point on the regression line corresponding to the regression equation.

RANDOM-EFFECTS MODEL

Immediately above, we showed how to interpret the regression under the fixed-effect model. Now, we show how to interpret the same regression under the random-effects model. Many of the statistics are different but, more fundamentally, many of the questions addressed by the analysis are different.

Basic analysis (Case A)

Here, we present a meta-analysis of the BCG studies. This is a basic meta-analysis in the sense that our goal is simply to estimate the mean effect size for the full set of studies. We will perform this exercise (1) using a traditional approach and then (2) using meta-regression, to show the correspondence between the two.

The traditional approach

Figure 16 is a screen-shot from CMA using the traditional approach to meta-analysis. Note that the effect size is in log units [A]. Each row shows the effect size and confidence interval for one study. The line marked "Random" in Figure 16 [B] shows the summary (mean) effect size as −0.7141, with a 95% confidence interval of −1.0644 to −0.3638. The Z-value for a test of the null is −3.9952, and the corresponding p-value is < 0.0001. The same numbers are displayed in Figure 17 [C].

Figure 16 | Basic analysis | Log risk ratio | Random-effects

Figure 17 | Basic analysis | Log risk ratio | Random-effects

Test of effect size

Is the mean effect size zero? The line labeled "Random" in Figure 16 [B] and Figure 17 [C] shows that the effect size is −0.7141 with a standard error of 0.1787. The Z-value for a test of the null is −3.9952 with a corresponding p-value of less than 0.0001. We would conclude that the mean effect size is probably not zero.

Heterogeneity

Is there any unexplained variance in the true effect sizes? The mean effect size is −0.7141.
Is it possible that all the observed variance about this mean reflects sampling error, or is there evidence that some of this variance reflects differences in the true effect size across studies? This is addressed in Figure 17 by the section labeled "Heterogeneity" [D]. The Q-value is 152.2330 with df = 12 and p < 0.0001. This tells us that it's unlikely that all of the variance is due to sampling error. We conclude that the true effect size probably does vary from study to study.

Note that the heterogeneity section for the random-effects analysis (Figure 17) is identical to the one for the fixed-effect analysis (Figure 7). In both cases the heterogeneity statistics are based on the weights 1/V (that is, where the only sampling error is within-studies). While the numbers are identical for the fixed-effect and the random-effects models, the interpretation differs. Under the fixed-effect model, the presence of heterogeneity in the true effects tells us that the statistical model does not match the data. Under the random-effects model, by contrast, this heterogeneity is employed to estimate T2, which is then incorporated into the weights assigned to each study.

How much variance is there?

This is addressed in the section labeled "Tau-squared" in Figure 17 [E]. The between-studies variance (T2) is estimated as 0.3088. The between-studies standard deviation (T) is simply the square root of T2, or 0.5557.

What proportion of the observed variance is true variance?

Some of the observed variance is due to real differences in effect size, while some reflects sampling error. The I2 statistic [D] reflects the proportion of variance that is due to real differences (and thus potentially explainable by covariates). In this case I2 is 92.1173%, which means that almost all of the observed variance reflects real differences in study effects.

The regression approach

We can perform the same analysis using meta-regression.
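Before turning to the regression, the role of T2 in the random-effects weights can be sketched directly. The snippet below (a Python sketch, not CMA output) uses the T2 estimate of 0.3088 reported above together with hypothetical within-study variances, and shows how adding T2 to every study's variance moderates the spread of the weights:

```python
# Sketch: random-effects weights moderate fixed-effect weights.
# T2 = 0.3088 is the between-studies variance reported above;
# the within-study variances are illustrative made-up values, NOT the BCG data.
T2 = 0.3088
v = [0.01, 0.05, 0.33]                  # hypothetical within-study variances

w_fixed = [1 / vi for vi in v]          # fixed-effect weight: 1/V
w_random = [1 / (vi + T2) for vi in v]  # random-effects weight: 1/(V + T2)

ratio_fixed = max(w_fixed) / min(w_fixed)     # spread of fixed-effect weights
ratio_random = max(w_random) / min(w_random)  # spread of random-effects weights
print(round(ratio_fixed, 1), round(ratio_random, 1))
```

Because the same constant T2 is added to every study's variance, the largest study's advantage shrinks: here the largest fixed-effect weight is 33 times the smallest, while under random-effects weights the ratio is only about 2. This is the mechanism behind the "more moderate weights" described earlier.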
Figure 18 shows the screen in CMA where we define the model [F]. We've used no covariates except for the intercept.

Figure 18 | Regression | Intercept | Setup

Figure 19 shows the results of this analysis.

Figure 19 | Regression | Intercept | Main results | Random-effects

Test of effect size

Is the mean effect size zero? Since there are no covariates, the predicted effect is simply the intercept, and so the question "Is the mean effect size zero?" is addressed by a test of the intercept. In Figure 19 the regression equation [G] gives the predicted effect size for all studies as Y = −0.7141 with a standard error of 0.1787 and confidence interval of −1.0644 to −0.3638. The Z-value for a test of the null is −3.9952 with a corresponding p-value of < 0.0001. We would conclude that the mean effect size is probably not zero. Note that these numbers match the numbers in Figure 16 [B] and Figure 17 [C].

Test of the model

The line labeled Model [H] addresses the hypothesis that the covariates explain any of the variance in effect size. Since there are no covariates in this model, this section is not relevant. The Q-value is shown as 0.0, the df as 0, and the p-value as 1.0.

Goodness of fit

Is there any unexplained variance in the true effect sizes? The predicted effect size for each study is simply the intercept, −0.7141. Is it possible that all the observed variance about the mean reflects sampling error, or is there evidence that some of this variance reflects differences in the true effect size across studies?

This line [I] is called "Goodness of fit" since the presence of true variance means (by definition) that some variance remains unexplained. That is, the prediction model does not "fit" (fully explain) the variance in effect sizes. The Q-value is 152.2330, with df = 12 and p < 0.0001. We conclude that the true effect size probably does vary from study to study. These statistics are identical to those in Figure 17 [D] for the traditional analysis.
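The I2 statistic reported for this analysis follows directly from the Q-value and its degrees of freedom: I2 = 100 × (Q − df)/Q. As a quick check in Python (using the values reported above, with df = 12 for the 13 studies):

```python
# I2: the proportion of observed dispersion attributable to real differences
# in true effects, computed from the heterogeneity statistics reported above.
Q = 152.2330   # weighted sum of squares about the intercept
df = 12        # 13 studies, intercept-only model

I2 = 100 * (Q - df) / Q
print(round(I2, 4))   # → 92.1173, matching the reported I2
```

The same Q and df also feed the method-of-moments (DerSimonian-Laird) estimate of T2, namely (Q − df)/C, where C is a scaling factor computed from the study weights; since the individual weights are not reproduced here, that step is left as a formula.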
How much variance is there? The between-studies variance (T2) is estimated as 0.3088. The between-studies standard deviation (T) is then the square root of T2, or 0.5557. These values correspond to the values in Figure 17 [E].

What proportion of the observed variance is true variance? Some of the observed variance is due to real differences in effect size, while some reflects sampling error. The I2 value [I] reflects the proportion of variance that is due to real differences (and thus potentially explainable by covariates). In this case I2 is 92.1173%, which means that almost all of the observed variance reflects real differences in study effects. This corresponds to the value in Figure 17 [D].

Graphic

In Figure 20 we've plotted all 13 studies, and we've also plotted the regression line [L]. Since there are no covariates in the predictive model, the regression line is horizontal. That is, the predicted value for every one of the studies is the intercept (which is also the mean) of −0.7141. The Q-statistic was computed by working with the deviation of every study from this predicted value. Note that this graphic applies to both the traditional analysis (where the predicted value is the mean) and the regression (where the predicted value is the intercept) since these values are identical.

Figure 20 | Dispersion of effects about grand mean

In addition to plotting the mean treatment effect [L] we can also plot the distribution of (true) treatment effects about this mean. Concretely, if we assume that these effects are normally distributed with a mean of −0.7141 and a standard deviation (T) of 0.5557, we would expect some 95% of all true effects to fall in the approximate range of −1.8033 to 0.3751 (that is, the mean ± 1.96 × T). This range is represented by the normal curve superimposed on the plot, which is centered at −0.7141 and extends from −1.8033 [N] to 0.3751 [M].
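The range of true effects just described is simply mean ± 1.96 × T. As a quick arithmetic check in Python, using the estimates reported above:

```python
# 95% range of true effects: mean +/- 1.96 * T,
# using the mean (-0.7141) and T (0.5557) reported above.
mean, T = -0.7141, 0.5557
lower = mean - 1.96 * T
upper = mean + 1.96 * T
print(round(lower, 4), round(upper, 4))   # → -1.8033 0.3751
```

These bounds reproduce the endpoints of the normal curve in Figure 20.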
The figure is drawn to scale, and we can see that the curve captures almost all of the effect sizes.

Note

The curve is intended to capture most of the dispersion in true effects, not in observed effects. As it happens, in this example the observed effects and the true effects are very similar (I2 is 92%, which means that almost all of the observed dispersion is real) and for that reason, most of the effects fall within the curve. However, this will not always be the case. For example, suppose that I2 had been 25%. In this case the true dispersion would have been less, and the curve would have been smaller. While the curve would still capture most of the dispersion in true effects, many of the observed effects would fall outside the curve.

Comparison of Model 1 with the null model

Returning to Figure 19, the goal of section [J] is to estimate R2, the proportion of between-study variance (T2) explained by the model. For this purpose we need two estimates of T2 – with predictors [I] and without predictors [J]. We will then use these two numbers to compute R2 [K].

Total between-study variance

The line labeled [J] displays the unexplained between-study variance (T2) when there are no predictors in the model, which is 0.3088. This is the same value displayed in the traditional analysis (Figure 17 [E]).

Line [J] gives the original T2 while line [I] gives the residual T2 after we've entered all the covariates. In this example, since there are no covariates in the model, the two values are identical. When there are covariates in the model, the estimate of T2 on line [I] may be less than the estimate on line [J], and the difference would be used to compute R2, the proportion of the original variance explained by the covariates.

Proportion of variance explained

This section of the screen [K] is used to report what proportion of the total variance can be explained by the predictive model. In this example there are no covariates.
Therefore, the estimate of T2 on line [J] is the same as the estimate on line [I] and R2 is shown as 0.00.

Summary

Our goal here was to show the correspondence between a traditional analysis and a regression for a simple analysis.

• In a meta-analysis with no covariates we want to estimate (and test) the effect size. This question is addressed by the mean effect (−0.7141) in the traditional analysis, and by the intercept (−0.7141) in the regression. In both cases the standard error is 0.1787 and the p-value is < 0.0001, which tells us that the true mean effect size is probably not zero.
• We also want to know if there is evidence that the true effect size varies across studies. This question is addressed by the Q-test for heterogeneity in the traditional analysis, and by the Q-test for the residual in the regression. In both cases the Q-value is 152.2330 with df = 12 and p < 0.0001, which tells us that the true effect sizes probably do vary.
• Finally, we want to estimate the variance in true effect sizes. This estimate, called T2, is 0.3088 in both cases. This variation is incorporated into the model, and affects the weights assigned to each study.

Subgroups analysis (Case B)

Immediately above (Case A) we found that the impact of the vaccine varied from study to study. The researchers hypothesized that this variation might be explained by the fact that studies were conducted in various locations, and that the vaccine was more effective in colder climates. To test this hypothesis we can classify each study as being either "Hot" or "Cold" and then perform an analysis (a) to estimate the effect size in each subgroup of studies, and (b) to compare the effect size for the two subgroups. (The researchers used latitude as a continuous covariate rather than creating two groups.)

The traditional approach

In Figure 21 the studies have been divided into subgroups.

• The six "Cold" studies are at the top, followed by their mean effect, a log risk ratio of −1.1987 [A].
• The seven "Hot" studies are at the bottom, followed by their mean effect, a log risk ratio of −0.2784 [B].

Figure 21 | Subgroups Cold vs. Hot | Random-effects

The same information is displayed in Figure 22. Here, the top section is labeled "Fixed-effect" and reports statistics based on fixed-effect weights. The bottom one is labeled "Mixed-effects" and reports statistics based on random-effects weights within subgroups. (The label "Mixed-effects" refers to the fact that we use the random-effects model within subgroups, but not between subgroups.)

In Figure 22 we are working with the section labeled "Mixed effects" (which means that we're using random-effects weights within subgroups). The lines marked [A] and [B] correspond to lines [A] and [B] in Figure 21, and show the mean effect size, standard error, variance, and confidence interval for the two subgroups.

Figure 22 | Subgroups Cold vs. Hot | Random-effects

Note on computing T2

CMA offers options for computing T2 in the presence of subgroups. Select the option to compute T2 within subgroups and then pool the estimates across subgroups, as shown in Figure 23 and Figure 24. For an explanation of these options, see Appendix 4: Computing τ2 in the presence of subgroups. To apply this option in CMA, on the analysis screen

• Select Computational options > Mixed and random effects options
• Select the option to Assume a common among-study variance across subgroups
• Select the option to Combine subgroups using a fixed-effect model

Figure 23 | Option for computing T2 in the presence of subgroups

Figure 24 | Option for computing T2 in the presence of subgroups

Is the mean effect size zero for each subgroup?

For the Cold studies this is addressed by Figure 21 [A] and Figure 22 [A]. The mean effect size is −1.1987 with a standard error of 0.1769, and confidence interval of −1.5445 to −0.8518. The test that the mean is zero is addressed by the Z-value of −6.7740 and corresponding p-value of < 0.0001.
We would conclude that the mean effect size in the universe of Cold studies is probably not zero.

For the Hot studies this is addressed by Figure 21 [B] and Figure 22 [B]. The mean effect size is −0.2784 with a standard error of 0.1522, and confidence interval of −0.5767 to +0.0199. The test that the mean is zero is addressed by the Z-value of −1.8292 and corresponding p-value of 0.0674. This fails to meet the traditional criterion alpha of 0.05, and so by this criterion we cannot reject the null that the effect size is zero (that the vaccine has no impact).

Test of the model

Is effect size related to subgroup membership? This question is addressed by the section labeled "Mixed-effects analysis". The lines marked Cold [A] and Hot [B] give the mean effect size for each group, based on random-effects weights. The line marked "Total between" is a test of the difference between these two values (−1.1987 vs. −0.2784). The Q-value for this difference is 15.5445 with 1 df, with a corresponding p-value of 0.0001 [C]. We conclude that effect size probably does differ by subgroup.

Heterogeneity

Is there any unexplained variance in the true effect sizes?

Immediately above, we saw that we can use information about a study's subgroup to improve our ability to predict that study's effect. That is, by using the subgroup mean rather than the grand mean to predict a study's effect size, we are able to make a more accurate prediction. But does subgroup membership enable us to completely predict that study's effect – do all studies within a subgroup share a common effect size? Or, is there variance in true effects within subgroups?

To test the assumption that there is no variance in true effect sizes within subgroups we compute Q and df within subgroups and then sum these values across subgroups as shown in Table 2. This table is taken from the section labeled [D] in Figure 22.
Table 2

             Cold       Hot        Total
Q            20.3464    21.4431    41.7894
df           5          6          11
p-value      0.0011     0.0015     0.0000

For Q = 41.7894 with 11 df, the p-value is < 0.0001. This tells us that the true effect sizes do vary from study to study, even within subgroups. Put another way, the model is incomplete – knowing whether a study falls into the Cold or Hot subgroup does not allow us to completely predict its effect size.

How much variance is there?

The Q-values computed immediately above are employed to estimate the variance of true effect sizes (T2) within subgroups. For the Cold subgroup T2 is 0.1383. For the Hot subgroup T2 is 0.0741. The combined estimate (computed within subgroups and combined across subgroups) is not shown on this screen, but is 0.0964 (see Appendix 4: Computing τ2 in the presence of subgroups). In each case the standard deviation of true effect sizes (T) is the square root of the variance. The combined estimate of T is 0.3105.

What proportion of the observed variance is true variance?

Some of the observed variance within subgroups is due to real differences in effect size, while some reflects sampling error. The I2 value reflects the proportion of this variance that is due to real differences (and thus potentially explainable by covariates). For Cold this value is 75.4256%, and for Hot this value is 72.0189%. The combined estimate is not shown on this screen, but is 73.6775%. This means that most of the within-subgroup variance reflects real differences in study effects (see Appendix 4: Computing τ2 in the presence of subgroups).

The regression approach

We can perform the same analysis as a meta-regression. Figure 25 shows the screen where we define the model.

• The first covariate is the intercept.
• The second covariate is the variable called Climate: Hot. This covariate will address the question of whether or not the effect size varies by climate. The sub-designation (Hot) follows the convention that variables are named and coded for the presence of an attribute.
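The p-values in Table 2 come from the chi-square distribution with the stated degrees of freedom. A pure-Python check (the helper name is ours):

```python
import math

def chi2_sf(q, df):
    """Upper-tail (survival) probability of the chi-square distribution,
    computed via the series expansion of the regularized lower incomplete
    gamma function."""
    a, x = df / 2.0, q / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= x / (a + n)
        total += term
    return 1.0 - total * math.exp(-x + a * math.log(x) - math.lgamma(a))

print(round(chi2_sf(20.3464, 5), 4))   # Cold subgroup
print(round(chi2_sf(21.4431, 6), 4))   # Hot subgroup
print(chi2_sf(41.7894, 11) < 0.0001)   # combined test
```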
Since the variable is called “Hot”, Cold studies will be coded 0 while Hot studies will be coded 1.

Figure 25 | Regression | Climate | Setup

Figure 26 | Regression | Climate | Main results | Random-effects

Figure 26 shows the results of the analysis.

It is important to note that the results shown on this page are collated from two separate analyses.

• We run one regression with the intercept and climate as predictors. This is the basis for the sections which report on the impact of each covariate [I], the test of the model [J], and the goodness of fit [K].
• We run a second regression with only the intercept. This is the basis for the section that reports the value of T2 with no covariates (that is, the true variance about the grand mean) [L].
• Then we use the estimate of T2 with covariates from [I] and without covariates [L] to compute the proportion of variance explained, or R2, in section [M].

Test of the model

Is effect size related to subgroup membership?

The question of whether or not effect size is related to subgroup membership is addressed by the section labeled “Test of the model” [J]. The Q-value is 15.5445, and with df = 1 the p-value is 0.0001. These are the same values that we saw in Figure 22 [E]. We conclude that effect size does differ by subgroup.

Equivalently (since there is only one covariate in the model), the Z-value for climate is 3.9426 with a p-value of 0.0001. (The test of climate is based on Z, which is a standardized difference. The test of the model is based on Q, which is a squared index. When there is only one covariate, Z2 is equal to Q. Here, 3.9426 squared is equal to 15.5445.)

Goodness of Fit

Is there any unexplained variance in the true effect sizes?

Immediately above, we saw that we can use information about a study’s subgroup to improve our ability to predict that study’s effect. But does this information enable us to completely predict that study’s effect – do all studies within a subgroup share a common effect size?
Or is there variance in true effects within subgroups? This is called a goodness-of-fit test, since we can say that the model provides a good fit to the effects if there is no evidence of unexplained heterogeneity.

To address this question we compute Q working with the deviation of each study from its predicted effect, which is −1.1987 for the Cold studies and −0.2784 for the Hot studies. Computed in this way, Q is 41.7894 with 11 df and the corresponding p-value is less than 0.0001 [K]. This tells us that the true effect size varies from study to study, even within subgroups. Put another way, the model is incomplete – knowing whether a study falls into the Cold or Hot subgroup does not allow us to completely predict its effect size. This is the same value we saw in the traditional analysis (Figure 22 [G]).

How much variance is there?

In this same section [K] the program shows that T2, the variance of true effect sizes about the subgroup mean, is 0.0964. It follows that T, the standard deviation of true effect sizes about the subgroup mean, is 0.3105. These values refer to the dispersion of true effects within each of the subgroups, and are assumed to be the same for all subgroups.

What proportion of the observed variance is true variance?

The I2 statistic [K] is 73.6775%, which means that nearly three-fourths of the observed variance that remains (that is, within subgroups) reflects real differences in study effects.

Note that the program reports I2 for two separate analyses. The one on line [L] is the proportion of the total variance that represents between-study (true) variance, and can potentially be explained by study-level covariates. The one on line [K] is the proportion of the within-subgroups variance that represents between-study (true) variance, and can potentially be explained by study-level covariates.

Graphic

In Figure 27 we’ve plotted all 13 studies, and the regression line.
(While the regression line is actually a line that intersects the Cold and Hot columns at specific points, we’ve taken the liberty of drawing a horizontal line at the points of intersection.)

The predicted value for the Hot studies is −0.2784 with a standard deviation of 0.3105. If we assume that the true effects are normally distributed about each predicted value we would expect the true effects for Hot studies to fall in the range of −0.2784 plus/minus 1.96 times 0.3105, or −0.8870 to 0.3302. In Figure 27 we have superimposed a normal curve on the study points to reflect this span of true effects [N].

The predicted value for the Cold studies is −1.1987 with a standard deviation of 0.3105. If we assume that the true effects are normally distributed about each predicted value we would expect the true effects for Cold studies to fall in the range of −1.1987 plus/minus 1.96 times 0.3105, or −1.8073 to −0.5901. In Figure 27 we have superimposed a normal curve on the study points to reflect this span of true effects [O].

Note that the plot shows the observed effects. By contrast, the curves are intended to capture some 95% of the true effects, which are assumed to fall closer to the predicted values.

Figure 27 | Dispersion of effects about the subgroup means

Comparison of Model 1 with the null model

Returning to Figure 26, the intent of the section labeled “Comparison of Model 1 with the null model” is to report the proportion of variance explained by the model, an index analogous to R2 in primary regression. The index is a ratio of the explained variance to the total variance, and to get both numbers we need to run two separate regressions.

To get the initial amount of variance we run a regression with no covariates and compute T2. Here, T2 is 0.3088, which is the variance of all studies about the grand mean [L].

To get the variance that remains with the covariates, we run a regression with the covariates and compute T2.
Here, T2 is 0.0964, which is the variance of all studies about the regression line [K].

Proportion of variance explained [M]

If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.0964, then the ratio

   T2(Residual) / T2(Total) = 0.0964 / 0.3088 = 0.3122   (1.3)

gives us the proportion of variance that is not explained by the covariates. R2, the proportion of variance that is explained by the covariates, is then

   R2 = 1 − T2(Residual) / T2(Total) = 1 − 0.3122 = 0.6878   (1.4)

We show this graphically in Figure 28, which juxtaposes Figure 20 with Figure 27. At left, the normal curve [P] reflects the unexplained variance in effects when the predicted value for each study is the grand mean. At right, the normal curves [Q, R] represent the variance in effects when the predicted value for each study is the corresponding subgroup mean. This is the variance not explained by subgroup membership. The variance at the right (0.0964) is less than the variance at the left (0.3088), which tells us that by using climate as a covariate we can reduce the unexplained variance – or (equivalently) explain some of the variance.

Figure 28 | Dispersion about grand mean vs. dispersion about subgroup means

Equation (1.3) gives the ratio of the variance at right to the variance at left (the ratio of not explained to total). Then in equation (1.4) we subtract this value from 1.0 to get the value of R2.

An equivalent approach to computing R2 is as follows. If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.0964, it follows that the difference (0.2124) is the T2 explained by the model. Then we can compute R2, the proportion explained by the model, as

   R2 = T2(Explained) / T2(Total) = 0.2124 / 0.3088 = 0.6878   (1.5)

Prediction equation

The prediction equation [I] is −1.1987 + 0.9203 × Climate.
Since Climate is coded 0 for Cold and 1 for Hot, the predicted value for Cold studies is

   Y = −1.1987 + 0 × 0.9203 = −1.1987   (1.6)

while the predicted value for Hot studies is

   Y = −1.1987 + 1 × 0.9203 = −0.2784   (1.7)

These are the same values that we saw for the subgroup means in Figure 22 [C] and [D].

Summary

The Q statistic

• The Q-Between (in the traditional model) and the Q-Model (in the regression) are both 15.5445, with df = 1 and p = 0.0001. Each tells us that effect size differs between subgroups.
• The Q-within (in the traditional model) and the Q for goodness of fit (in the regression) are both 41.7894, with df = 11 and p < 0.0001. Each tells us that the true effect size varies, even within subgroups.
• The Q-total in each case is 152.2330, with df = 12 and p < 0.0001. Each tells us that effect sizes vary when we ignore subgroups and work with deviations of all studies from the grand mean.

The I2 statistic

• The I2 statistic tells us what proportion of the variation in observed effects reflects variance in true effects rather than sampling error.
• When there are no covariates [L] the I2 value is 92.1173%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates.
• When we use climate as a covariate [K] the I2 value is 73.68%, which tells us that some 74% of the observed variance about the subgroup means is real, and may potentially be explained by additional covariates.

The R2 Index

• The between-study variance is estimated at 0.0964 within subgroups, as compared to 0.3088 for the population as a whole. It follows that the variance (in log units) explained by subgroups is 0.2124. The ratio of explained to total corresponds to an R2 of 0.6878, meaning that 68.7824% of the variance in true effects can be explained by climate. This is reflected in Figure 28, where the range of true effects about subgroup means is smaller than the range of true effects about the grand mean.
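The numbers in this summary interlock, and it can be a useful sanity check to reproduce them. A short sketch (the variable names are ours), consistent with the usual definitions of I2 and R2:

```python
# Values reported for Case B (climate as the covariate)
q_model, q_within, q_total = 15.5445, 41.7894, 152.2330
df_within, df_total = 11, 12
t2_total, t2_resid = 0.3088, 0.0964

# With a single covariate, Z squared equals Q-model
z_climate = 3.9426
assert abs(z_climate ** 2 - q_model) < 0.01

# I-squared is the excess of Q over its df, as a proportion of Q
i2_total = (q_total - df_total) / q_total      # ≈ 0.9212, i.e. 92.1173%
i2_within = (q_within - df_within) / q_within  # ≈ 0.7368, i.e. 73.6775%

# R-squared: proportion of the true variance explained by the covariate
r2 = 1 - t2_resid / t2_total                   # ≈ 0.6878
```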
Continuous covariate (Case C)

In the prior analysis we classified each study as either “Cold” or “Hot” based on its absolute distance from the equator. In the original paper, the researchers did not classify the studies as Cold or Hot. Rather, they worked with the absolute latitude of each study as a continuous covariate. We turn to that analysis now.

The traditional approach

There is no mechanism to work with a continuous covariate in the traditional framework.

The regression approach

Figure 29 shows the screen where we define the regression using the intercept and latitude to predict the effect size. Figure 30 shows the results of this analysis.

Figure 29 | Regression | Latitude | Setup

Figure 30 | Regression | Latitude | Main results | Random-effects

As before, it is important to note that the results shown on this page are collated from two separate analyses.

• We run one regression with the intercept and latitude as predictors. This is the basis for the sections which report on the impact of each covariate [A], the test of the model [B], and the goodness of fit [C].
• We run a second regression with only the intercept. This is the basis for the section that reports the value of T2 with no covariates (that is, the true variance about the grand mean) [D].
• Then we use the estimate of T2 with covariates from [C] and without covariates [D] to compute the proportion of variance explained, or R2, in section [E].

Prediction equation

At the top [A] the program shows the coefficient for the intercept and for each covariate, along with the standard error, confidence interval, and significance test. In Figure 30 the prediction equation [A] is 0.2595 − 0.0292 × Latitude.

Test of the model

Is effect size related to latitude?

The question of whether or not effect size is related to latitude is addressed by the section labeled “Test of the model” [B]. The Q-value is 18.8452 with 1 df and a p-value of < 0.0001.
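The p-value for this Q can be checked directly. With df = 1 the chi-square upper tail is simply a two-sided normal tail, so a one-line sketch suffices:

```python
import math

q_model = 18.8452  # Q for the test of the model, df = 1

# For df = 1, P(chi-square > q) equals P(|Z| > sqrt(q)) = erfc(sqrt(q/2))
p = math.erfc(math.sqrt(q_model / 2))
print(p < 0.0001)  # True
```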
We conclude that effect size probably is related to latitude.

Equivalently (since there is only one covariate in the model), the Z-value for latitude is −4.3411 with a p-value of < 0.0001. (The test of latitude is based on Z, which is a standardized difference. The test of the model is based on Q, which is a squared index. When there is only one covariate, Z2 is equal to Q. Here, −4.3411 squared is equal to 18.8452.)

Goodness of fit

Is there any unexplained variance in the true effect sizes?

Immediately above, we saw that we can use information about a study’s latitude to improve our ability to predict that study’s effect. But does this information enable us to completely predict that study’s effect? That is, do all studies at the same latitude share a common effect size? Or is there variance in true effects among studies at the same latitude?

This is addressed by the Goodness of Fit [C]. To compute a measure of dispersion we work with the deviation of each study from that study’s predicted effect size, where the predicted effect size is a function of each study’s latitude. Computed as a deviation from this predicted value, Q is 30.7331 with 11 df, and the corresponding p-value is 0.0012. This tells us that the true effect size varies from study to study, even within latitudes. Put another way, the model is incomplete – knowing a study’s latitude does not allow us to completely predict its effect size. (Unlike subgroups, we may not have multiple studies at the same latitude, but the idea is the same – for each study we compute the deviation from the prediction line to the observed effect size.)

How much variance is there?

The program [C] shows that T2, the variance of true effect sizes at any point on the regression line, is 0.0633. It follows that T, the standard deviation of true effect sizes at any point on the regression line, is 0.2516.

What proportion of the observed variance is true variance?
• The I2 statistic tells us what proportion of the variation in observed effects reflects variance in true effects rather than sampling error.
• When there are no covariates [D] the I2 value is 92.1173%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates.
• When we use latitude as a covariate [C] the I2 value is 64.21%, which tells us that some 64% of the observed variance about the regression line is real, and may potentially be explained by additional study-level covariates.

Graphic

In Figure 31 we’ve plotted all 13 studies and the regression line. The Q-statistic was computed by working with the deviation of every study from the regression line.

Figure 31 | Dispersion of effects about regression line for latitude

The estimate of the variance (T2) is 0.0633 and of the standard deviation (T) is 0.2516. If we assume that these effects are normally distributed about each predicted value we would expect the true effects for all studies to fall at the predicted value plus/minus 1.96 T, or within 0.4931 on either side of the predicted value. This holds true for any point on the regression line, but for illustrative purposes we have superimposed a normal curve at a few arbitrary points on the regression line [F, G, H] to reflect this range.

Note that the plot shows the observed effects. By contrast, the curves are intended to capture some 95% of the true effects, which are assumed to fall closer to the regression line.

Comparison of Model 1 with the null model

Returning to Figure 30, the intent of the section labeled “Comparison of Model 1 with the null model” is to report the proportion of variance explained by the model, an index analogous to R2 in primary regression. The index is a ratio of the variance explained to the total variance, and to get both numbers we need to run two separate regressions.

To get the initial amount of variance we run a regression with no covariates and compute T2.
Here, T2 is 0.3088, which is the variance of all studies about the grand mean [D].

To get the variance that remains with the covariates, we run a regression with the covariates and compute T2. Here, T2 is 0.0633, which is the variance of all studies about the regression line [C].

Proportion of variance explained

The value of T2 with the covariate, reported above [C] as 0.0633, is the variance of studies about their predicted values. If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.0633, then the ratio

   T2(Residual) / T2(Total) = 0.0633 / 0.3088 = 0.2050   (1.8)

gives us the proportion of variance that is not explained by the covariates. R2, the proportion of variance that is explained by the covariates, is then

   R2 = 1 − T2(Residual) / T2(Total) = 1 − 0.2050 = 0.7950   (1.9)

We show this graphically in Figure 32, which juxtaposes Figure 20 with Figure 31. At left, the normal curve [I] reflects the unexplained variance in true effects when the predicted value for each study is the grand mean. At right, the normal curves [J, K, L] represent the variance in true effects when the predicted value for each study is the corresponding point on the regression line. This is the variance not explained by latitude. The variance at the right (0.0633) is less than the variance at the left (0.3088), which tells us that by using latitude as a covariate we can reduce the unexplained variance – or (equivalently) explain some of the variance.

Figure 32 | Dispersion about grand mean vs. dispersion about regression line

An equivalent approach to computing R2 is as follows. If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.0633, it follows that the difference (0.2455) is the T2 explained by the model.
Then we can compute R2, the proportion explained by the model, as

   R2 = T2(Explained) / T2(Total) = 0.2455 / 0.3088 = 0.7950   (1.10)

Summary

The Q statistic

• The Q-Model is 18.8452 with df = 1 and p < 0.0001. This tells us that effect size is related to latitude.
• The Q-value for goodness of fit is 30.7331 with df = 11 and p = 0.0012. This tells us that the true effect size varies, even among studies at the same latitude.
• The Q-total is 152.2330 with df = 12 and p < 0.0001. This tells us that effect sizes vary when we ignore latitude and work with deviations of all studies from the grand mean.

The I2 statistic

• The variance in effect sizes that is observed within a given latitude is partly due to real differences (which can potentially be explained by additional study-level covariates) and partly due to within-study sampling error. When there are no covariates [D] the I2 value is 92.1173%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates. When we use latitude as a covariate the I2 value [C] is 64.2080%, which tells us that some 64% of the remaining variance [F, G, H in Figure 31] is real, and may potentially be explained by additional covariates.

The R2 statistic

• The between-study variance is estimated at 0.0633 at any given point on the regression line based on latitude, as compared to 0.3088 about the grand mean. This corresponds to an R2 of 0.7950, meaning that 79.50% of the true variance in effects can be explained by latitude.

In context

We presented three cases to show the correspondence between regression and a traditional analysis. In Case A there were no covariates; in Case B there was a categorical covariate; and in Case C there was one continuous covariate.

In Case A we tested the effect size by looking at the mean (for the traditional analysis) or the intercept (for the regression).
In Case B we looked at the relationship between effect size and subgroup (for the traditional analysis), or between the effect size and the covariate (for regression). In Case C we looked at the relationship between effect size and the covariate.

For the traditional analysis, to estimate the variance in true effect sizes we computed the Q value based on deviations from the grand mean (Case A), or the Q value based on deviations from each study’s subgroup mean (Case B). The regression model is a more general model and allows us to cover all cases by saying that we computed the Q-value based on deviations from each study’s predicted value. In Case A the predicted value is the grand mean (as it was for the traditional analysis). In Case B the predicted value is the subgroup mean (as it was for the traditional analysis). In Case C the predicted value is the point on the regression line corresponding to the regression equation.

In each case, once we had Q we used it to estimate the true variance for the relevant population (all studies, studies within subgroups, or studies at the same latitude). In addition to estimating the variance of effects (T2) and the standard deviation of effects (T), we were able to report what proportion of the observed variance was real (I2) and what proportion of the original variance was explained by the predictive model (R2).

PART 6: META-REGRESSION IN CMA

WHAT’S NEW IN THIS VERSION OF META-REGRESSION?

• The prior version of CMA (Version 2) included a module to perform simple regression (one covariate).
• The current version (Version 3) incorporates a full-fledged regression module, which allows for any number of covariates. Additionally, the new module includes an array of sophisticated options, including the following.
• Define “Sets” of variables, such as a set of covariates that together capture the impact of a categorical variable, or a set of covariates that represent the linear, curvilinear, and cubic relationship of dose with effect size.
• Automatically create and code dummy variables for categorical covariates.
• Select from an array of computational options, including the choice to use the Z distribution or the Knapp-Hartung adjustment, and to use method of moments (MM), full maximum likelihood (ML), or restricted maximum likelihood (REML) for estimating τ2.
• The regression plot incorporates many options, including the ability to display both confidence intervals and prediction intervals for the regression line.
• Export data and residuals to Excel™ for further processing.

Advanced options

• The program allows you to define two or more prediction models. For example, define one model that includes a series of nuisance variables and another that includes these variables plus variables that represent the treatment. The program displays the proportion of variance explained by each model and also a test that compares models.

THE COVARIATES AND THE PREDICTIVE MODEL

In the prior chapter our goal was to show the correspondence between meta-regression and a traditional meta-analysis. For that reason we included no more than one predictor in each analysis. Of course, a key strength of meta-regression is that it allows us to include two or more predictors in an analysis, and we will be using multiple predictors in most of the examples that follow.

The interpretation of a meta-regression is basically the same as that of regression in a primary study. The analysis will yield a set of statistics for each covariate, as well as a set of statistics for the model. The statistics for each covariate reflect the impact of that covariate, with all other covariates held constant. The statistics for the full model reflect the combined impact of all covariates.
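Under the hood, a fixed-effect meta-regression of this kind is weighted least squares, with each study weighted by the inverse of its variance. A minimal sketch with made-up data (the variable names and values are ours, not the BCG dataset; a random-effects analysis would add an estimate of τ2 to each variance before weighting):

```python
import numpy as np

# Illustrative (made-up) effect sizes, within-study variances, and covariates
y = np.array([-0.9, -1.2, -0.3, -0.1, -0.7, -1.4])   # log risk ratios
v = np.array([0.05, 0.08, 0.04, 0.06, 0.05, 0.07])   # within-study variances
lat = np.array([44.0, 55.0, 13.0, 20.0, 42.0, 52.0]) # covariate A (continuous)
alloc = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 0.0])     # covariate B (0/1 dummy)

X = np.column_stack([np.ones_like(y), lat, alloc])   # intercept + A + B
W = np.diag(1.0 / v)                                 # inverse-variance weights

# Weighted least squares: b = (X'WX)^-1 X'Wy
XtWX = X.T @ W @ X
b = np.linalg.solve(XtWX, X.T @ W @ y)

se = np.sqrt(np.diag(np.linalg.inv(XtWX)))  # standard error of each coefficient
z = b / se                                  # test of each covariate, others held constant
```

Because both covariates sit in the same design matrix, each coefficient is automatically adjusted for (partialled on) the other, which is the point made in the bullets that follow.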
• If we have covariates A and B and want to know the impact of each covariate ignoring the other (that is, ignoring the potential confound) we would run two analyses. The first analysis would include only A. The second would include only B.
• If we have covariates A and B, and we want to know the impact of each with the other held constant, we would run one analysis that includes both A and B. The statistics for each covariate reflect the impact of that covariate with all other covariates partialled, or held constant. The statistics for the model reflect the contribution of A and B as a set.
• If we have covariates A and B, and we want to know the impact of each with the other held constant, and also assess the interaction A × B, we would run one analysis that includes A, B, and AB. The statistics for AB give the impact of the interaction over and above the main effects. The statistics for the model reflect the impact of the two main effects plus the interaction.

QUICK START

1) On the data-entry screen
   a) Create a column for study name
   b) Create a set of columns for the effect size
   c) Identify one or more columns as “Moderators” and set the subtype to either “Integer”, “Decimal”, or “Categorical”
   d) Enter the data

Or, simply start CMA and then open the BCG file. Be sure to use the file BCGP if your computer uses a period to indicate decimal places, or BCGC if it uses a comma for this purpose.

2) On the main analysis screen
   a) Optionally, select the effect size index
   b) Optionally, select the studies to be included in the regression
   c) Optionally, specify how to work with studies that included multiple subgroups, outcomes, time-points, or comparisons
   d) Click Analyses > Meta-regression 2

3) On the regression screen – define the regression
   a) Select the covariates to be included in the model
   b) Optionally, define “Sets” of covariates
   c) Optionally, define multiple models
   d) Optionally, select [Computational options]
   e) Run the analysis

4) On the regression screen – navigate the results
   a) Click the [Fixed] or [Random] tab at the bottom of the screen to select the model
   b) Click the model name (when several models have been created)
   c) Use the toolbar to move between the main analysis screen, the scatterplot, diagnostics, increments, model comparisons, and other screens

5) On the regression screen – save the analysis (optional)

6) On the regression screen – export the results (optional)

In this manual we use the BCG data as the motivating example. See Part 1: Data files and downloads for the location of the files.

STEP 1: ENTER THE DATA

Insert column for study names

In Figure 33 [A], click Insert > Column for > Study names.

Figure 33 | Data-entry | Step 01

As shown in Figure 34 [B], the program creates a column labeled “Study name”.

Figure 34 | Data-entry | Step 02

Insert columns for effect size data

In Figure 35, click Insert > Column for > Effect size data.

Figure 35 | Data-entry | Step 03

The program opens a wizard (Figure 36) that allows you to specify the kind of summary data you will enter.

• Select <Show all 100 formats> [D]
• Click [Next] [E]

Figure 36 | Data-entry | Step 04

In the wizard (Figure 37)

• Select the top option button [F]
• Click [Next] [G]

Figure 37 | Data-entry | Step 05

In Figure 38, drill down to

• Dichotomous (number of events)
• Unmatched groups, prospective (e.g., controlled trials, cohort studies)
• Events and sample size in each group [H]

Then, click <Finish>.

Note that we will be entering events and sample size (N) for each group. Some of the texts that use the BCG example report events and non-events rather than events and N.
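Whichever input format is chosen, the summary data are converted to the selected effect size. For events and sample size per group, the log risk ratio and its variance follow the standard formulas; a sketch with illustrative counts (the function name and the counts are ours):

```python
import math

def log_risk_ratio(events1, n1, events2, n2):
    """Log risk ratio and its variance from events and sample size per group.
    Group 1 is the treated (vaccinated) group; group 2 is the control group."""
    lrr = math.log((events1 / n1) / (events2 / n2))
    # standard large-sample variance of the log risk ratio
    var = 1 / events1 - 1 / n1 + 1 / events2 - 1 / n2
    return lrr, var

# Illustrative counts: 4 TB cases among 123 vaccinated, 11 among 139 controls
lrr, var = log_risk_ratio(4, 123, 11, 139)
se = math.sqrt(var)
```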
Figure 38 | Data-entry | Step 06

The program creates columns as shown in Figure 39. It also opens a wizard that allows you to label the columns.

• Enter Vaccine/Control as names for the two groups [I]
• Enter TB/Ok as names for the two outcomes [J]

Then, click [Ok].

Figure 39 | Data-entry | Step 07

The program applies the labels as shown in Figure 40 [K].

Figure 40 | Data-entry | Step 08

Insert columns for moderators (covariates)

Next, we need to create columns for the moderator variables. As shown in Figure 41,

• Click Insert > Column for > Moderator variable [L]

Figure 41 | Data-entry | Step 09

The program opens a wizard (Figure 42)

• Set the variable name to “Latitude” [M]
• Set the column function to Moderator [N]
• Set the data type to Integer [O]

Then, click [Ok].

Figure 42 | Data-entry | Step 10

As shown in Figure 43, click Insert > Column for > Moderator variable

• Set the variable name to “Year”
• Set the column function to Moderator
• Set the data type to Integer

(This is the year the study was performed, not the year of publication.)

Then, click [Ok].

Figure 43 | Data-entry | Step 11

As shown in Figure 44, click Insert > Column for > Moderator variable

• Set the variable name to “Allocation”
• Set the column function to [Moderator]
• Set the data type to [Categorical]

This moderator tracks the mechanism utilized to assign people to be vaccinated (or not). The possibilities are random, alternate, and systematic.

Then, click [Ok].

Figure 44 | Data-entry | Step 12

As shown in Figure 45, click Insert > Column for > Moderator variable

• Set the variable name to “Climate”
• Set the column function to [Moderator]
• Set the data type to [Categorical]

This moderator tracks the climate. The possibilities are Cold and Hot.

Then, click [OK].

Figure 45 | Data-entry | Step 13

Customize the screen

The program initially displays the odds ratio (Figure 46).

• We want to work with the risk ratio rather than the odds ratio.
• Additionally, we want to display the risk ratio in log units.

Therefore, we need to customize the display as follows.

• Right-click in any yellow column
• Click <Customize computed effect size display> [A]

Figure 46 | Data-entry | Step 14

The program displays this wizard (Figure 47)

• Tick the box for Risk ratio [B]
• Tick the box for Log risk ratio [B]

Figure 47 | Data-entry | Step 15

As shown in Figure 48, we can set Log risk ratio as the default effect size, and also hide the odds ratio

• In the drop-down box, select Log risk ratio as the primary index [C]
• Un-check the box for odds ratio [D]
• Un-check the box for log odds ratio [D]

Then click [Ok].

Figure 48 | Data-entry | Step 16

The screen now looks like Figure 49.

Figure 49 | Data-entry | Step 17

Enter the data

You can enter the data manually, or copy and paste from Excel™ or another source (see Appendix 1: The dataset).

In Figure 50 you enter effect-size data into the white columns [E]. The program automatically computes the values in the yellow columns [F].

Figure 50 | Data-entry | Step 18

• You may continue to add the other moderators, as enumerated in the appendix.
• Or, open the file BCG.cma. There are two versions of this file, one using a period to indicate decimal places and one using a comma. Use the one that corresponds to your computer’s settings.

STEP 2: RUN THE BASIC META-ANALYSIS

To run the analysis, click [Analysis > Run Analysis] as shown in Figure 51.

Figure 51 | Data-entry | Step 19

The main analysis screen

The program displays the main analysis screen (Figure 52). The current effect size [A] is “Log risk ratio”. If you want to switch to another effect size, click [Effect measure: Log risk ratio] on the toolbar and select an alternate index.

The next few pages outline the main analysis in CMA using the traditional approach. However, this is optional.
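For orientation, the summary effect that this basic analysis produces is an inverse-variance weighted mean of the study effects. A minimal fixed-effect sketch with made-up numbers (not the BCG values):

```python
import math

# Illustrative (made-up) log risk ratios and within-study variances
effects = [-0.5, -1.0, -0.2]
variances = [0.04, 0.09, 0.06]

weights = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
se = math.sqrt(1.0 / sum(weights))      # standard error of the pooled effect

# A random-effects analysis would add tau-squared to each variance
# before computing the weights.
```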
As soon as you arrive at the main analysis screen (Figure 52) you can click [Analysis > Meta-regression 2] to proceed immediately to the regression module. The initial meta-analysis In Figure 52, the <Fixed> tab [B] is selected, so the program is displaying the results for a fixed-effect analysis [C]. The effect size is in log units [A]. A C B Figure 52 | Basic analysis | Fixed-effect | Log risk ratio 84 Click the tab [D] for <Random>. The program [E] displays results for a random-effects analysis. D E Figure 53 | Basic analysis | Random-effects | Log risk ratio 85 Display moderator variables Next, we want to display the moderator variables on the plot. Note that this is optional, and has no effect on the regression. In Figure 54, click View > Columns > Moderators [F] F Figure 54 | Basic analysis | Display moderators The program displays a list of all variables that had been defined as moderators on the data-entry screen. Drag and drop each of the following onto the main screen, to the right of the “p-value” column (Figure 55 [G]): Latitude, Year, Allocation, and Climate. G Figure 55 | Basic analysis | Display moderators 86 The screen should now look like Figure 56. You can right-click on any column and sort by that column. Since the data had been sorted by latitude on the data-entry screen, the program initially displays the studies in that sequence; here, the studies are sorted by latitude. It appears that the effect size is small (near 0 in log units) for studies in warmer climates (toward the top) [H] and larger (as extreme as −1.58) for studies in colder climates (toward the bottom) [I]. H I Figure 56 | Basic analysis | Display moderators 87 Display statistics Click <Next table> to display the statistics shown in Figure 57. Using random-effects weights [J], the summary log risk ratio is −0.7141. The Z-value is −3.995 with a corresponding p-value of 0.0001.
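Both this p-value and the 95% range of true effects reported on the statistics screen follow from the summary values by simple arithmetic. A sketch (values copied from Figure 57):

```python
import math

def norm_sf(z):
    """Upper-tail probability of the standard normal (via erfc)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Values read from the statistics screen (Figure 57)
mean_log_rr = -0.7141   # random-effects summary log risk ratio
z = -3.995              # test of the null that the mean log RR is 0
tau = 0.5557            # estimated SD of true effects (T)

# Two-tailed p-value for the summary effect
p = 2 * norm_sf(abs(z))

# Range covering ~95% of true effects: mean +/- 1.96 * T
lo = mean_log_rr - 1.96 * tau
hi = mean_log_rr + 1.96 * tau
lo_rr, hi_rr = math.exp(lo), math.exp(hi)   # back to risk-ratio units
```

The exponentiated endpoints recover the risk-ratio range (roughly 0.16 to 1.46) discussed with the heterogeneity statistics.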
Thus, we can reject the null hypothesis that the log risk ratio is 0.0 (or, equivalently, that the risk ratio is 1.0). If we assume that the studies are valid, we can conclude that the vaccine (on average) probably does prevent TB. At the same time, there is also a substantial amount of dispersion in the effect size. Tau-squared [K] is 0.3088 and Tau is 0.5557. To get a general sense of the true dispersion we can assume that the true effects are normally distributed about the random-effects estimate of the mean effect, and that some 95% of all true effects fall within 1.96 T of this mean. Then (in log units) most true effects fall in the range of −1.8032 to +0.3750. This corresponds to risk ratios of approximately 0.16 (a strongly protective effect) to 1.46 (a harmful effect). It would be very important to understand the reason for this dispersion, and for this purpose we turn to meta-regression. K J Figure 57 | Basic analysis | Display statistics for heterogeneity 88 STEP 3: RUN THE META-REGRESSION At this point we proceed to the meta-regression. On the analysis screen (Figure 58), select Analysis > Meta-regression 2. Note. If you don’t see any regression option, you may have a lite or standard version of the program rather than the professional version. If you see an option for Meta-regression but not Meta-regression 2, you have Version 2 of CMA rather than Version 3. Figure 58 | Run regression | Step 01 89 The Interactive Wizard The program displays the screen shown in Figure 59. The interactive wizard will walk you through all the steps in running the regression. To display or hide the wizard, use the Help menu. Figure 59 | Run regression | Step 02 90 Add covariates to the model When you initially open the regression module the program displays the following
• The main screen [A]
• A list of available covariates [B]
A B C Figure 60 | Run regression | Step 03 We need to move the covariates from the wizard [B] onto the main screen [A].
Add variables in the sequence shown here (allocation, year, and latitude) to recreate the example that we use in this text.
• Click “Allocation” on the wizard
• Click [Edit reference group] and select [Random]
• Click [Add to main screen] [C]
• Click “Year” on the wizard and then click [Add to main screen] [C]
• Click “Latitude” on the wizard and then click [Add to main screen] [C]
91 The model is shown in Figure 61. Note that “Allocation” is displayed as two lines [D], which are linked by a bracket. Since allocation is a categorical variable, the program automatically creates dummy variables to represent allocation. See the next chapter for a full discussion. Tick the check-boxes for all covariates [E] F D Figure 61 | Run regression | Step 04 The covariates are controlled by the “Covariates” toolbar [F]. On this toolbar,
• [Show covariates] shows or hides the wizard
• [Remove covariates] allows you to remove a covariate from the main screen
• [Move up] and [Move down] allow you to edit the sequence of covariates
• The blue and red checks allow you to add (or remove) checks from a series of check-boxes
92 Set computational options The program allows you to specify various options for the computations. Click Computational options to display the menu in Figure 62. Figure 62 | Run regression | Step 05 Each of these options is discussed in Part 10: Computational options. To follow the example in this text, set the options as follows.
• Method for estimating T2 (Method of moments)
• One-tailed or two-tailed test for p-values (Two-tailed)
• Confidence level (95%)
• Display the variance inflation factor (Off)
• Z distribution or the Knapp-Hartung adjustment for p-values and confidence intervals (Z)
93 Run the regression To run the regression, simply click “Run regression” on the toolbar [A] in Figure 63. A Figure 63 | Run regression | Step 06 94 STEP 4: NAVIGATE THE RESULTS Main results screen (Fixed effect) After you run the regression
• Click [Main Results] [A]
• Click [Fixed] at the bottom to select the statistical model [B]
A B Figure 64 | Main results | Fixed-effect For a full discussion of how to interpret the output for a fixed-effect analysis, see Part 7: Understanding the results. 95 Main results screen (Random effects) After you run the regression
• Click [Main Results] [C]
• Click [Random] at the bottom to select the statistical model [D]
C D Figure 65 | Main results | Random-effects For a full discussion of how to interpret the output for a random-effects analysis, see Part 7: Understanding the results. 96 Difference between the fixed-effect and random-effects displays Under the fixed-effect model we assume that there is one source of sampling error (within-study variance), whereas under the random-effects model we allow that there may be two sources of sampling error (within-study variance and between-study variance). Since the weight assigned to each study is the inverse of the variance, the weight assigned to each study depends on the model. In the following pages, this will be evident in the fact that for the statistics that are presented under both models (such as the effect size and its standard error), the value depends on the statistical model. Perhaps more fundamentally, the statistical model determines what statistics we choose to present. The results screen is quite different for the fixed-effect vs.
the random-effects model, reflecting the fact that the model determines what questions we can ask of the data. When we work with the fixed-effect model we assume that all studies share a common effect size. We don’t need to estimate T2 (the between-study variance), since this is assumed to be zero. If T2 is assumed to be zero, we don’t estimate I2 (the ratio of between-study variance to total variance), since this must be zero. Nor do we estimate R2, the proportion of between-study variance explained by the predictors, since this must also be zero. For example, see Figure 71. By contrast, when we work with the random-effects model we allow that the true effect size may vary from one study to the next, and therefore these statistics help us to understand this variation. We can estimate the variation of effect sizes (a) about the grand mean and (b) about the regression line. By comparing the two we can compute R2, the proportion of variance explained by the predictors. We can also compute I2 for each case (with and without covariates), and this tells us what proportion of the observed variance reflects variation in true effect sizes rather than random error. For example, see Figure 76. 97 Plot To display the plot
• Click [Scatterplot] on the menu bar to navigate to the plot [A]
• Select [Fixed] or [Random] on the tab at the bottom of the screen [B]
• To specify the variable for the X-axis, o Right-click on the X-axis label [C] or o Click on the drop-down tool [D]
For a full discussion of the plot see page 157. D A B C Figure 66 | Plot 98 Other screens To navigate to other tables of results, click “More results” [A] in Figure 67 and then select any of the following.
About the predictive model
• Covariance (see page 132)
• Correlation (see page 133)
• Diagnostics (see page 125)
• R-squared graphic (see page 145)
About the data included in (or excluded from) the analysis
• All studies (see page 244)
• Valid studies (see page 244)
Statistics that compare different predictive models
• Increments (see page 134)
• Models summary (see page 262)
• Compare models (detailed) (see page 262)
• Compare models (p-value) (see page 262)
A Figure 67 | Other screens 99 STEP 5: SAVE THE ANALYSIS Once you’ve run a meta-regression you can save the predictive model as shown in Figure 68.
• Click File > Save regression file as … [A]
• This will save the regression template with an extension of .cmr.
A Figure 68 | Save analysis The .cmr file saves the instructions for the analysis, NOT the data. By analogy, programs such as SPSS™, SAS™, and Stata™ allow you to save a set of commands in one file and the data in another file. The commands can then be applied to any data file that has the same variables.
• The .cmr file, saved here, is analogous to the command file in the other programs.
• The .cma file, saved from the data-entry screen, is analogous to the data file in the other programs.
The .cmr file saves the following
• The list of covariates
• The list of models
• The check-boxes for each model
• The sets
• The model names
In another session you can open a data file on the main data-entry screen. Then, return to the regression module and click File > Open file to open the .cmr file and re-run the analysis. The .cmr file can be used with the same dataset that was used to create it, or with another dataset that includes the same variables. For example, 100
• You may return to the data-entry screen and add new studies
• You may return to the main analysis screen and edit the study filters
• You may be working with an entirely different data set that has the same variables as the first one.
In any of these cases, navigate to the regression module and click File > Open to open the .cmr file. When you open a .cmr file the program simply recreates the main MR screen as though you had entered it manually. The .cmr file does not save the statistical settings that were in place when the file was created. These include the method employed to estimate T2, the use of Z or Knapp-Hartung, the confidence level, and the choice of a one-sided or two-sided test. 101 STEP 6: EXPORT THE RESULTS The program offers two options for exporting the results of any analysis.
• Export the results to Excel™. Then, you can perform additional computations within Excel™, and/or format the results and copy them as a table to other programs.
• Copy the results to the clipboard as a picture. Then, paste this picture into Word™ or any other program.
Figure 69 shows an example for the main analysis screen.
• Click [File > Save results as Excel™ file and open] [A]
• Provide a name for the Excel™ file
A Figure 69 | Export results 102 The program creates the Excel™ file shown in Figure 70. Figure 70 | Export results The same idea applies to any screen that displays results. 103 PART 7: UNDERSTANDING THE RESULTS 104 MAIN RESULTS
• To run the regression, click Run Regression. The program will display the Main Results screen
• Click [Fixed] or [Random] to select the statistical model
• The following pages show the results screen for each statistical model.
The top of the screen is similar for the two models (Figure 72 and Figure 77). For either model it shows the impact of each covariate with other covariates partialled. The difference between the two screens is that one is based on fixed-effect weights while the other is based on random-effects weights. After that, however, the screens differ in some fundamental ways which reflect the difference in the two statistical models. While this holds true for any predictive model, we’ll use the case of one categorical covariate (Climate) as an example.
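The weighting difference that drives the two displays can be sketched in a few lines. The within-study variances and the between-study variance below are hypothetical (in practice CMA estimates them from the data):

```python
# Hypothetical within-study variances for four studies, and a
# hypothetical between-study variance estimate (tau-squared)
v = [0.10, 0.05, 0.20, 0.08]
tau2 = 0.12

w_fixed  = [1 / vi for vi in v]           # fixed-effect weight: 1 / V
w_random = [1 / (vi + tau2) for vi in v]  # random-effects weight: 1 / (V + tau2)
```

Because tau2 is added to every study's variance, the random-effects weights are smaller and more nearly equal, so large studies dominate the analysis less than under the fixed-effect model.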
For the fixed-effect model (Figure 72) the program displays a table similar to the analysis of variance table we see for primary studies. This table includes a row for
a) The model: the WSS for the deviation of the subgroup means from the grand mean
b) The residual: the WSS for the deviation of all studies from their subgroup means
c) The total: the WSS for the deviation of all studies from the grand mean
The program does not present any statistics for between-study variance (T2) nor for the proportion of between-study variance explained by the model (R2). For studies that share the same predicted value, T2 is assumed to be zero, and so there is no reason to report it. If T2 with the predictive model in place is assumed to be zero, then R2 also has no real meaning (by definition, once we apply the model, no between-study variance remains). By contrast, for the random-effects model (Figure 77) the program does not display an analysis of variance table. The idea of partitioning the WSS only works if the weight for each study is constant. This condition is met under the fixed-effect model, since the weights are always the same (based on within-study variance), but not under the random-effects model, since the weights (which are based also on T2) change when we introduce covariates (and thus the frame of reference for computing T2). For the random-effects analysis we do want to present statistics for T2 and for R2. To do this, we need to run a series of distinct analyses and then collate the results. Specifically, we run one analysis to get T2 with the covariates, and another to get T2 without the covariates. The change in T2 gives us the amount of variance explained by the predictive model, and this value over the original T2 gives us the proportion of variance explained (R2).
To highlight the fact that these statistics are coming from separate analyses, the statistics are not presented in a table, but rather in separate sections on the screen. 105 MAIN RESULTS, FIXED-EFFECT ANALYSIS To navigate to this screen Click [Run regression] [A] A Figure 71 | Setup 106 The toolbar changes as shown in Figure 72.
• Click “Main results” [B]
• Click [Fixed] [C]
B H E F G D C Figure 72 | Main results | Fixed-effect 107 Test of the model Analysis of variance In section [D] the total WSS (weighted sum of squares, Q) is partitioned into the following. Model The model [E] is the test that the predictive model explains any of the variance in effect size. Put another way, it asks if the dispersion of effects about the regression line is smaller when the regression line is based on the covariates rather than based solely on the grand mean. Here, Q = 128.2186 with df = 4 and p < 0.0001, so we conclude that the predictive model explains (at least) some of the variance in effect size. Residual The residual [F] is the test that the data are consistent with the model’s assumption of a common effect size for all studies with the same predicted value. The Q-value is 24.0144 with df = 8 and p = 0.0023. We conclude that the data are not consistent with the assumptions of the fixed-effect model. Total The total [G] is the test that the variance for the full set of studies (with no predictors) is zero. The Q-value is 152.2330 with df = 12 and p < 0.0001. In a primary study, the total sum of squares (SS) is the sum of the SS explained by the model and the SS residual. Similarly, in a meta-analysis (with fixed-effect weights) the total weighted sum of squares (WSS) is the sum of the WSS explained by the model and the WSS residual. As shown in Figure 72 [D],
WSS_T = WSS_M + WSS_RES = 152.2330 = 128.2186 + 24.0144 (1.11)
Similarly, the total df is the sum of the model and the residual df (12 = 4 + 8).
df_T = df_M + df_RES = 12 = 4 + 8 (1.12)
Impact of individual covariates [H] In Figure 72, the test of the model [E] is an omnibus test for the full set of covariates. It tells us that the set as a whole is related to effect size. By contrast, the rows at the top [H] address the unique impact of each covariate – that is, the impact of each covariate when all of the other covariates are held constant. Since the effect size is the risk ratio, all analyses are carried out in the log metric and all coefficients are in the log metric. In this example, virtually all predicted effects are less than zero, so 0 is no effect, −1 is a large effect, and −2 is a very large effect (see Figure 73). In this example, therefore, a negative coefficient means that as the covariate gets larger the vaccine is more effective (see Appendix 8: Interpreting regression coefficients). 108 To understand the direction of the effect size as a function of covariates, it’s helpful to work with the scatterplot as discussed immediately below. Allocation Allocation type is a categorical covariate with three groups (Randomized, Alternate, and Systematic), and therefore is represented by a set of two dummy variables. In Figure 72, the test of this set yields Q = 6.3651 with df = 2 and p = 0.0415. Thus, there is evidence that effect size is related to allocation type. The relationship between allocation type and effect size, with other covariates partialled, is displayed in Figure 73. For a more specific analysis we can look at each line within the set. Alternate allocation has a coefficient of 0.6320 (the vaccine is less effective in studies that employed alternate allocation as opposed to randomized allocation) and a p-value of 0.0366. Systematic allocation has a coefficient of 0.3062 (the vaccine is less effective in studies that employed systematic allocation as opposed to randomized allocation).
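The two Allocation rows in this table correspond to the standard dummy (indicator) coding for a three-level categorical covariate with Random as the reference group (as selected in the wizard). A minimal sketch of that coding — the helper below is illustrative, not part of CMA:

```python
def allocation_dummies(allocation):
    """Map an allocation label to (Alternate, Systematic) indicators.

    The reference group, Random, is coded (0, 0); each coefficient
    then compares its group with randomized allocation.
    """
    return (1 if allocation == "Alternate" else 0,
            1 if allocation == "Systematic" else 0)

rows = [allocation_dummies(a)
        for a in ["Random", "Alternate", "Systematic", "Random"]]
# Random     -> (0, 0)   reference group
# Alternate  -> (1, 0)
# Systematic -> (0, 1)
```

The omnibus test for the set asks whether both indicator coefficients are simultaneously zero, which is why it carries df = 2.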
However, as will be discussed in the chapter on caveats (page 293), these findings may be due to a confound with other factors. Figure 73 | Plot | Fixed-effect Click on Scatterplot and select Allocation to produce the plot shown in Figure 73. There is a column for each allocation type (Random, Alternate, and Systematic). In each column the program displays the observed effect sizes as well as the summary effect size and the confidence interval for the summary effect size. In this example the “Fixed” tab is selected at the bottom of the screen, so all statistics are based on the fixed-effect model. The reader will note that the summary effect size for the alternate allocation studies falls outside the range of the actual effect sizes for the two studies in this group. This reflects the fact 109 that these means are adjusted for other covariates (and serves as a caution against performing these kinds of adjustments with a small number of studies). Year The coefficient for Year is +0.0235, which means that for every increase of one year the log risk ratio will increase by 0.0235 (the vaccine was less effective in later trials). The coefficient plus/minus 1.96 times the standard error (0.0159) yields the 95% confidence interval for the coefficient, which is −0.0076 to +0.0545. The coefficient divided by its standard error yields a Z value of 1.4795, and the corresponding p-value of 0.1390. Thus, when latitude and allocation method are held constant, the relationship between year and effect size is not statistically significant. Figure 74 | Plot | Year | Fixed-effect Click on Scatterplot and select Year to produce the plot shown in Figure 74. The regression line shows that as the Year increases, the effect size moves closer to zero. Since the effect size is a log risk ratio, this means that the treatment effect (when adjusted for the other covariates) declined over the years.
The confidence interval shows the range of regression lines that are consistent with the data – in other words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested by the arrows) until it encountered the confidence interval. The uncertainty is such that the true regression line could be either in an upward or downward direction. This corresponds to the p-value of 0.1390 for Year in Figure 72, and the fact that the confidence interval for the coefficient included both negative and positive values (−0.0076 to +0.0545) 110 Latitude The coefficient for latitude is −0.0213, which means that for every increase of one unit (degree) in latitude the log risk ratio will decrease by 0.0213 (vaccine is more effective at greater latitudes). The coefficient plus/minus 1.96 times the standard error (0.0084) yields the 95% confidence interval for the coefficient, which is −0.0378 to −0.0048. The coefficient divided by its standard error yields a Z value of −2.526, and the corresponding p-value of 0.0115. Thus, even when year and allocation method are held constant, the relationship between latitude and effect size is statistically significant. Figure 75 | Plot | Latitude | Fixed-effect Click on [Scatterplot] and select Latitude to produce the plot shown in Figure 75. The regression line shows that as the absolute Latitude increases, the effect size moves further from zero. Since the effect size is a log risk ratio, this means that the treatment effect (when adjusted for the other covariates) increases as we move further from the equator. The confidence interval shows the range of regression lines that are consistent with the data – in other words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested by the arrows) until it encountered the confidence interval. While there is substantial uncertainty, all likely regression lines are in the same (downward) direction. 
This corresponds to the p-value of 0.0115 for Latitude in Figure 72, and the fact that the confidence interval for the coefficient includes only negative values (−0.0378 to −0.0048). 111 Summary The model The total Q, based on the deviation of each effect size from the grand mean, can be partitioned into its component parts – the Q due to the variation in effect size that can be explained by the covariates, and the part that cannot.
• Model. The Q-value for the model is 128.2186 with df = 4 and p < 0.0001, which tells us that effect size is related to at least one of the covariates.
• Residual. The Q-value for the residual is 24.0144 with df = 8 and p = 0.0023, which tells us that the assumptions of the fixed-effect model have been violated.
• Total. The Q-value for the total is 152.2330 with df = 12 and p < 0.0001, which tells us that effect sizes vary when we ignore the covariates and work with deviations of all studies from the grand mean.
Individual covariates Where the test of the model is an omnibus test for the full set of covariates, the table at the top addresses the impact of each covariate with all other covariates held constant.
• Alternate allocation is associated with a smaller effect size (but see the chapter on caveats). The p-value for allocation is 0.0415.
• Later studies show less impact of the vaccine, but the relationship is not statistically significant. The p-value for year is 0.1390.
• Studies that fall further from the equator show more impact of the vaccine. The p-value for latitude is 0.0115.
112 MAIN RESULTS, RANDOM-EFFECTS ANALYSIS To navigate to this screen Click [Run regression] [A] A Figure 76 | Run regression | Setup 113 The toolbar changes as shown here
• Click “Main results” [B]
• Click [Random] [C]
B C Figure 77 | Main results | Random-effects 114 The results presented in Figure 78 are based on three separate analyses. Each of these analyses yields specific items of information, which are pulled together on this screen.
D E F G Figure 78 | Main results | Random-effects The differences among the three analyses are shown in Table 3.
Table 3
Section   Function                          Covariates   Weights
D         Random-effects estimates          Yes          V + T2
E         Variance not explained by model   Yes          V
F         Original variance                 No           V
Section D reports statistics for an analysis that employs random-effects weights and includes the covariates. This provides a test of the model, and is also the analysis used in the table at the top of the screen. 115 Section E reports statistics for an analysis that includes the covariates but assigns weights based on V. This provides a goodness-of-fit test. Specifically, we use this analysis to estimate the residual T2, the variance not explained by the covariates. Section F reports statistics for an analysis that does not include covariates and assigns weights based on V. This allows us to estimate the original T2, the total amount of variance. Section [G] is based on the analyses in sections [E] and [F]. Section [E] gives us the variance that cannot be explained by the covariates, while section [F] gives us the total variance. We can use these to compute the ratio of explained to total, which is presented in section [G] (see Part 8: The R2 index). 116 Test of the model [D] Is effect size related to the covariates? The test of the model is a simultaneous test that all covariates (except the intercept) are zero. The Q-value is 13.1742 with df = 4 and p = 0.0105. We reject the null and conclude that at least one of the covariates is probably related to effect size. Goodness of fit [E] Is there any unexplained variance in the true effect sizes? Immediately above, we saw that the covariates improve our ability to predict a study’s effect. But does this information enable us to completely predict a study’s effect – do all studies with the same values on all covariates share a common effect size? Or is there variance in true effects among studies with the same predicted value?
The Q statistic, based on the deviation of each study from its predicted value, is 24.0144, with 8 df and a corresponding p-value of 0.0023. This tells us that the true effect size probably varies from study to study, even for studies that are identical on all covariates. Put another way, the model is incomplete – knowing a study’s allocation type, year, and latitude does not allow us to completely predict its effect size. How much variance is there? In section [E] the program shows that T2, the variance of true effect sizes at any point on the regression line, is 0.1194. It follows that T, the standard deviation of true effect sizes at any point on the regression line, is 0.3455. We can use this to get a sense of how closely the true effects at any point on the regression line are (or are not) clustered together. In Figure 79 we’ve plotted all 13 studies, the regression line, and a series of normal curves about the regression line. Each normal curve is centered at some point on the regression line, and extends 1.96 T on either side of that line. If the true effects are normally distributed with standard deviation T, then 95% of studies with that predicted value will have a true effect size within the range of the normal curve. For example, consider the normal curve labeled [N]. At this point, the value of the regression line is −1.5. If we were to run many studies at this latitude, the mean effect in these studies would be −1.5. However, the true effect size in any single study would typically fall somewhere above or below this value. The normal curve tells us that 95% of these studies would have true effects in the range indicated by the curve, approximately from −0.8 to −2.2. The decision to display normal curves at three specific points is arbitrary. These curves could have been placed at any points on the regression line. 117 L M N Figure 79 | Dispersion of effects about regression line for latitude What proportion of the observed variance is true variance?
The variance of the observed effects about the regression line incorporates both within-study variance (error) and between-study variance (that can be potentially explained by additional study-level covariates). The I2 statistic [E] is 66.69%, which tells us that some 67% of the variance of observed effects about the regression line falls into the latter group. A useful way of using I2 is to help us understand what the distribution of effects would look like if we could plot the true effects rather than the observed effects. An I2 of 67% tells us that the variance of the distribution would shrink by about one-third. The problem with this number is that it’s in square units, and it’s not intuitive what it means that the variance will shrink by a third. It might be more intuitive to work with the square root of I2, (I), which is 0.8166. If we were looking at the true scores rather than the observed scores, the dispersion of effects about the regression line (in linear units) would shrink by about 18%. Impact of individual covariates The test of the model [D] is an omnibus test for the full set of covariates. It tells us that at least one of the covariates is probably related to effect size. By contrast, the table at the top [F] addresses the impact of each covariate with all of the other covariates partialled (or held constant). Since the effect size is the risk ratio, all analyses are carried out in the log metric, and all coefficients are in the log metric. In this example, virtually all predicted effects are less than zero, so 0 is no effect, −1 is a large effect, and −2 is a very large effect. In this example, therefore, a negative coefficient means that as the covariate gets larger the vaccine is more effective. (The reverse would be true if the predicted values were all positive). 118 To understand the direction of the effect size as a function of covariates, it’s helpful to work with the scatterplot as discussed immediately below. 
Allocation Allocation type is defined as a set of two covariates. The test of the set yields Q = 1.5402 with df = 2 and p = 0.46, and so there is no evidence that effect size is related to allocation type. For a more specific analysis we can look at each line within the set.
• Alternate allocation has a coefficient of 0.4855 (the vaccine is less effective in studies that employed alternate allocation as opposed to randomized allocation) and a p-value of 0.3127.
• Systematic allocation has a coefficient of 0.4574 (the vaccine is less effective in studies that employed systematic allocation as opposed to randomized allocation) and a p-value of 0.2260.
Neither of these p-values is statistically significant. Figure 80 | Plot | Allocation method | Random-effects Click on Scatterplot and select Allocation to produce the plot shown in Figure 80. There is a column for each allocation type (Random, Alternate, and Systematic). In each column the program displays the observed effect sizes as well as the summary effect size and the confidence interval for the summary effect size. In this example the “Random” tab is selected at the bottom of the screen, so all statistics are based on the random-effects model. 119 The predicted effect size for studies that employed alternate allocation (center column) or systematic allocation (right-hand column) is closer to zero than the predicted effect size for studies that employed randomized allocation (left-hand column). As above, none of these differences is statistically significant. Year The coefficient for Year [H] is 0.0148, which means that for every increase of one year the log risk ratio will increase by 0.0148 (the vaccine became less effective over time). The corresponding p-value is 0.5225. Figure 81 | Plot | Year | Random-effects Click on [Scatterplot] and select Year to produce the plot shown in Figure 81. The regression line shows that as the Year increases, the effect size moves closer to zero.
Since the effect size is a log risk ratio, this means that the treatment effect (when adjusted for the other covariates) declined over the years. The confidence interval shows the range of regression lines that are consistent with the data – in other words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested by the arrows) until it encountered the confidence interval. The uncertainty is such that the true regression line could run in either an upward or a downward direction. This corresponds to the p-value of 0.5225 for Year in Figure 78, and to the fact that the confidence interval for the coefficient includes both negative and positive values (−0.0306 to +0.0603).

Latitude

The coefficient for Latitude [I] is −0.0190, which means that for every increase of one unit (degree) in latitude the log risk ratio decreases by 0.0190 (the vaccine is more effective at greater latitudes). The coefficient plus or minus 1.96 times its standard error (0.0159) yields the 95% confidence interval for the coefficient, −0.0503 to 0.0122. The coefficient divided by its standard error yields a Z-value of −1.1924, with a corresponding p-value of 0.23. Thus, when year and allocation method are held constant, the relationship between latitude and effect size is not statistically significant.

Figure 82 | Plot | Latitude | Random-effects

Click on [Scatterplot] and select Latitude to produce the plot shown in Figure 82. The regression line shows that as the absolute latitude increases, the effect size moves further from zero. Since the effect size is a log risk ratio, this means that the treatment effect (when adjusted for the other covariates) increases as we move further from the equator.
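The statistics reported for a coefficient can be reproduced from the coefficient and its standard error alone. A sketch using the latitude values above (the two-sided p-value uses a normal approximation; the program's value may differ slightly in later decimals):

```python
import math

b = -0.0190   # coefficient for latitude (log risk ratio per degree)
se = 0.0159   # standard error of the coefficient

# Z-value: coefficient divided by its standard error (about -1.19)
z = b / se

# 95% confidence interval: coefficient plus/minus 1.96 standard errors
ci_low, ci_high = b - 1.96 * se, b + 1.96 * se

# Two-sided p-value from the standard normal distribution (about 0.23)
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
```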
The confidence interval shows the range of regression lines that are consistent with the data – in other words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested by the arrows) until it encountered the confidence interval. There is substantial uncertainty in the coefficient. This corresponds to the p-value of 0.2331 for Latitude in Figure 78, and to the fact that the confidence interval for the coefficient (−0.0503 to +0.0122) includes the null value, zero.

In this example none of the individual covariates has a p-value less than 0.05. Since the model as a whole is statistically significant, the fact that no covariate is statistically significant probably reflects the fact that some of the covariates are correlated with each other. For example, latitude or year might be statistically significant if entered into the equation alone. However, if the two are correlated with each other and compete to explain the same variance, neither has a unique impact that meets the threshold for statistical significance.

Comparison of Model 1 with the null model

We want to report what proportion of variance is explained by the predictive model, and for this purpose we need to know how much variance there was initially (with no covariates). For this reason we run a regression with no covariates (the null model) and compute T2. Here, T2 is 0.3088, which is the variance of all studies about the grand mean.

Proportion of variance explained

To get the final amount of variance we run a regression with the covariates and compute T2. This value, reported above as 0.1194, is the variance of studies about their predicted value.
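The computation just described uses only the two T2 estimates. A minimal sketch with the values reported above (the variable names are our own):

```python
t2_total = 0.3088      # T2 with no covariates: variance about the grand mean
t2_residual = 0.1194   # T2 with covariates: variance about the regression line

# Variance explained by the model is the difference between the two
t2_explained = t2_total - t2_residual

# R2: the proportion of between-study variance explained by the covariates
r_squared = 1 - t2_residual / t2_total

# Equivalently, the ratio of explained variance to total variance
r_squared_alt = t2_explained / t2_total
```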
• With no covariates in the model [F] the unexplained variance (T2) is 0.3088.
• With covariates in the model [E] the unexplained variance (T2) is 0.1194.
• The difference between these values, 0.1894, is the variance explained by the model.

If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.1194, then the ratio

T2_Residual / T2_Total = 0.1194 / 0.3088   (1.13)

gives us the proportion of variance that is not explained by the covariates. R2, the proportion of variance that is explained by the covariates, is then

R2 = 1 − T2_Residual / T2_Total = 1 − 0.1194 / 0.3088 = 0.6133.   (1.14)

In Figure 78 this is on the line labeled [G]. We show this graphically in Figure 83. At left, the normal curve [O] reflects the unexplained variance in true effects when the predicted value for each study is the grand mean. At right, the normal curves [P, Q, R] represent the variance in true effects when the predicted value for each study is the corresponding point on the regression line. This is the variance not explained by latitude. The variance at the right is less than the variance at the left, which tells us that by using latitude as a covariate we can reduce the unexplained variance – or (equivalently) explain some of the variance.

Figure 83 | Dispersion of effects about two regression lines (left: T2 = 0.3088 about the grand mean [O]; right: T2 = 0.1194 about the regression line [P, Q, R])

An equivalent approach to computing R2 is as follows. If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.1194, it follows that the difference (0.1894) is the T2 explained by the model. Then we can compute R2, the proportion explained by the model, as

R2 = T2_Explained / T2_Total = 0.1894 / 0.3088 = 0.6133.   (1.15)

Summary

The full model

• The Q-value for the model is 18.85 with df = 1 and p < 0.0001. This tells us that effect size is related to (at least some of) the covariates.
• The Q-value for goodness of fit is 30.73 with df = 12 and p = 0.0012.
This tells us that the effect size varies, even within studies that share the same value on all covariates.
• The Q-total is 152.23 with df = 12 and p < 0.0001. This tells us that effect sizes vary when we ignore the covariates and work with deviations of all studies from the grand mean.

The I2 statistic

• The observed variance in effect sizes is partly due to real differences and partly due to within-study sampling error. When there are no covariates [F] the I2 value is 92%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates. When we use these covariates [E] the I2 value is 66.69%, which tells us that 66.69% of the remaining variance is real, and may potentially be explained by additional covariates.

The R2 statistic

• The between-study variance at any given point on the regression line based on these covariates is estimated at 0.1194, as compared to 0.3088 for the regression line based on the grand mean. This corresponds to an R2 of 0.6133, meaning that 61.33% of the variance in true effects can be explained by the covariates.

DIAGNOSTICS

To navigate to the diagnostics screen

• Run the analysis [A]

Figure 84 | Setup

The toolbar changes as shown here.

• Click More > Diagnostics [B]
• Select the statistical model (Fixed or Random) from the tabs at the bottom

Figure 85 | Diagnostics

Observed value

This is simply the observed effect size.

The Predicted Value

The predicted (fitted) value, T̂_i, for the ith study is the value obtained by using the estimated regression coefficients b0, b1, …, bp and the covariate values for the ith study, x_i1, …, x_ip, to compute the value of the effect size predicted for that study by the regression model

T̂_i = b0 + b1 x_i1 + … + bp x_ip.   (1.16)

The Residual

The (unstandardized) residual value, e_i, for the ith study is the difference between the observed value and the fitted value

e_i = T_i − T̂_i.   (1.17)

If e_i = 0, the fitted value and the observed value are identical (the observed value is exactly on the regression line or plane), but if e_i is far from 0, the predicted value is not close to the observed value. In meta-analysis, effect sizes and their fitted values from different studies can have very different sampling uncertainties (standard errors). This makes it difficult to interpret differences in the magnitude of residuals from different studies. Studentized and jackknifed residuals address this problem of comparability by dividing the residual by its standard error.

Studentized Residual

The studentized residual value, e_si, for the ith study is the residual divided by its standard error

e_si = e_i / SE(e_i).   (1.18)

The standard error of e_i is given by

SE(e_i) = sqrt((1 − h_i) / w_i),   (1.19)

where w_i is the weight given to the ith effect size in the analysis and h_i is the leverage of the ith effect size. Therefore the residual divided by its standard error (the ith studentized residual) is

e_si = e_i sqrt(w_i / (1 − h_i)).   (1.20)

It is important to note that the standard error of the residual depends both on the residual variance – which is determined by the conditional variance of the estimate (and the random-effects variance component in random-effects models) – and on the configuration of predictors (including the values for the ith study).

The studentized residual e_si is on a standard scale, so the values from different studies are more comparable than those of the unstandardized residuals (the e_i). If the regression model is correctly specified, the e_si have approximately a normal distribution with unit standard deviation, so that e_si values greater than 2 in absolute value occur only about 5% of the time by chance, and values greater than 2.5 are quite unusual.
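The studentized residual takes only three ingredients. A minimal sketch with made-up values for one study (the residual, weight, and leverage are hypothetical, not from the BCG data):

```python
import math

e_i = 0.40   # residual (observed minus fitted), hypothetical
w_i = 10.0   # weight given to the study in the analysis, hypothetical
h_i = 0.20   # leverage of the study, hypothetical

# Standard error of the residual, per equation (1.19)
se_e = math.sqrt((1 - h_i) / w_i)

# Studentized residual, per equation (1.18); equation (1.20) is the same
# quantity written in one step.
e_si = e_i / se_e
e_si_direct = e_i * math.sqrt(w_i / (1 - h_i))

# |e_si| > 2 would occur only about 5% of the time by chance
flagged = abs(e_si) > 2
```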
The actual sampling distribution of e_si will often be closer to Student's t-distribution with k − Q degrees of freedom, so slightly larger reference values (than 2 and 2.5) may be appropriate for judging the extremeness of residuals when k − Q is small.

Jackknifed Residual

The jackknifed residual, e_ji, is similar to the studentized residual in that it is standardized. However, the jackknifed residual is based on the difference between the observed effect size in the ith study and the fitted value for the ith study computed with the ith study deleted from the dataset. That is,

e_ji = (T_i − T̂_(i)i) / SE(T_i − T̂_(i)i),   (1.21)

where T̂_(i)i is the fitted value for the ith study computed from all studies except the ith study. To be precise,

T̂_(i)i = b_(i)0 + b_(i)1 x_i1 + … + b_(i)p x_ip,   (1.22)

where b_(i)0, …, b_(i)p are the regression coefficients estimated with the ith study removed from the dataset.

The jackknifed residual is designed to better reveal cases where the ith study does not fit the same model as the other studies. By removing the (potentially distorting) impact of the ith study from the computation of the regression coefficients, the jackknifed residual sometimes makes it easier to see how different an observed effect size is from what would be expected if that study fit the meta-regression model that is appropriate for all of the other studies. Let w_i(i) denote the weight of the ith study computed using the variance-component estimate with the ith study removed; then the ith jackknifed residual is equivalent to

e_ji = e_i sqrt(w_i(i) / (1 − h_i)).   (1.23)

The sampling distribution of the jackknifed residual is similar to that of the studentized residual (approximately normal), and similar reference values for judging extremeness are appropriate.
The actual sampling distribution of e_ji will often be closer to Student's t-distribution with k − Q − 1 degrees of freedom, so slightly larger reference values (than 2 and 2.5) may be appropriate for judging the extremeness of residuals when k − Q − 1 is small.

Leverage

Leverage is a diagnostic that reveals how much potential influence a particular study can have on the result of the meta-regression. Let h_i be the leverage of the ith study. The values of the leverage are always between zero and one inclusive, that is, 0 ≤ h_i ≤ 1. The sum of the leverages is h_1 + … + h_k = Q, where Q is the total number of predictors including the intercept (that is, Q = p + 1 when there is an intercept in the model and Q = p if there is no intercept). Thus the average value of the leverage is Q/k, and estimates of regression coefficients are most efficient when all the leverage values are close to Q/k.

If h_i = 0, this implies that the fitted (predicted) value of the effect size for the ith study would be the same even if that study were not part of the data used to estimate the regression coefficients. In one sense, this implies minimal influence. If h_i = 1, this implies that the fitted value for the ith study could not be estimated without the data from that study; in other words, the fitted value for that study depends entirely on data from that study. This latter situation is equivalent to saying that there is a regression coefficient (or linear combination of regression coefficients) whose estimate is determined entirely by the data from the ith study. In other regression contexts, a reference value of 2Q/k has been suggested as indicating a study of high leverage.

The term leverage arises from a mechanical analogy. Imagine a scatterplot of the effect size versus one predictor.
In this one-predictor situation, the studies whose x (predictor) values are far from the center of the data will have high leverage, because moving them up or down would have a large influence on the regression slope. When there is more than one predictor, there may be studies whose combination of predictor values is far from the center in a multivariate sense. The leverage diagnostic may reveal such multivariate outliers, which are not obvious from looking at the predictors one at a time.

Cook's Distance

Cook's distance, D_i, for the ith study is a measure of how much the estimated regression coefficients change (on average) when the ith study is deleted from the dataset. Like the studentized and jackknifed residuals, D_i is standardized, but unlike them it is in a squared (distance-squared) metric. One can think of D_i as the squared difference between b, the vector of regression coefficients estimated from all studies, and b_(i), the vector of regression coefficients estimated from all studies except the ith, divided by the variance of b. That is,

D_i = (b − b_(i))′ V^(−1) (b − b_(i)) / Q = w_i h_i e_i^2 / (Q (1 − h_i)^2),   (1.24)

where V is the covariance matrix of b and Q is the number of predictors including the intercept. In other regression contexts, the value 4/(k − Q) has been suggested to help identify studies that have large influence.

DFFITS

DFFITS is a diagnostic that describes the change in the fitted (predicted) value of the ith study that would arise as a consequence of deleting the ith study from the data used to compute the regression coefficients, expressed in standard-error units. DFFITS is a standardized version of the difference T̂_i − T̂_(i)i and is defined as

DFFITS_i = e_i sqrt(w_i h_i) / (1 − h_i),   (1.25)

where the fitted values with and without the ith study are

T̂_i = b0 + b1 x_i1 + … + bp x_ip   (1.26)

and

T̂_(i)i = b_(i)0 + b_(i)1 x_i1 + … + b_(i)p x_ip,   (1.27)

and where b_(i)0, …, b_(i)p are the regression coefficients estimated with the ith study removed from the dataset. Like the jackknifed residual, DFFITS is designed to better reveal cases where the ith study does not fit the same model as the other studies.
By removing the (potentially distorting) impact of the ith study from the computation of the regression coefficients, DFFITS sometimes makes it easier to see how different an observed effect size is from what would be expected if that study fit the meta-regression model that is appropriate for all of the other studies. In other regression contexts, the reference value 2·sqrt(Q/k) has been suggested for identifying studies with potentially large impact on fitted values.

Variance

The variance, v_i, of the ith study is the conditional (estimation error) variance of the effect size in the ith study. Because v_i depends on the sample size in each study, v_i can vary substantially across studies.

Tau Squared

Tau-squared, τ2, is the estimate of the between-studies variance among effect-size parameters at any point on the prediction line. An assumption of the meta-regression is that this true variance of effect sizes is the same for all values of the covariates.

Sum

Sum is the total variance of the ith effect size, which is v_i in fixed-effect meta-regression or τ2 + v_i in random-effects meta-regression.

Weight

The weight of the ith study, w_i, is the actual (raw) weight assigned to this study in the analysis, namely the reciprocal of the total variance: w_i = 1/v_i in fixed-effect meta-regression and w_i = 1/(τ2 + v_i) in random-effects meta-regression.

Percent Weight

Percent weight for the ith study is the percentage of the total weight accorded to that study, that is, w_i divided by the sum of all the study weights.

How to use the diagnostics

Regression diagnostics are designed to be simple checks that reveal important features of the data and of the regression model fitted to those data. However, the multivariate situation is complex, and diagnostics are typically imperfect. For example, consider the important feature of collinearity (correlations) among predictors.
It is well known that collinearity can degrade the quality of estimates of regression coefficients by increasing their sampling uncertainty. This can occur when two predictors are highly correlated but independent of the others, or when there is a high multiple correlation among predictors (when one predictor is almost a linear combination of several others). These two situations have different implications for the quality of the regression estimates. In the former case, only the two coefficients corresponding to the correlated predictors may be poorly estimated. In the latter case, the impact of collinearity may affect more coefficients. A diagnostic designed to reveal collinearity in general may not be able to distinguish between the two types of collinearity. On the other hand, producing diagnostics tailored to a myriad of possible special cases increases the complexity of the suite of diagnostics, defeating the purpose of simple checks on the data and the regression model.

We have implemented a set of diagnostics that have proven most useful in regression problems generally and adapted them to meta-regression. All of these diagnostics are related, in that they are different ways of looking at the extent to which the data associated with a study are inconsistent with the meta-regression model that fits the other studies. However, they approach the problem in different ways.

• The leverage and Cook's distance focus on the impact of a study on the estimated regression coefficients.
• The residual and studentized residual focus on the difference between the fitted (predicted) effect sizes based on all the data and the observed effect size in each study.
• DFFITS and the jackknifed residual focus on the difference between the fitted values of each effect size when the regression coefficients are estimated with and without a particular study.
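To make the relationships among these diagnostics concrete, here is a self-contained sketch for a one-predictor weighted regression with hypothetical data (not the BCG studies). It is meant only to show how the formulas for leverage, the studentized residual (1.20), Cook's distance (1.24), and DFFITS (1.25) fit together, not to reproduce the program's estimation:

```python
import math

# Hypothetical effect sizes T, covariate x, and fixed weights w
T = [-1.2, -0.8, -0.5, -0.2, 0.1]
x = [10.0, 20.0, 30.0, 40.0, 50.0]
w = [4.0, 6.0, 5.0, 3.0, 2.0]
k, Q = len(T), 2   # k studies; Q = 2 predictors (intercept + slope)

# Weighted least-squares fit of T on x
sw = sum(w)
xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
tbar = sum(wi * ti for wi, ti in zip(w, T)) / sw
sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
b1 = sum(wi * (xi - xbar) * (ti - tbar) for wi, xi, ti in zip(w, x, T)) / sxx
b0 = tbar - b1 * xbar

fitted = [b0 + b1 * xi for xi in x]
resid = [ti - fi for ti, fi in zip(T, fitted)]           # e_i
lev = [wi * (1.0 / sw + (xi - xbar) ** 2 / sxx)          # h_i
       for wi, xi in zip(w, x)]

# Studentized residuals (1.20), Cook's distance (1.24), DFFITS (1.25)
stud = [ei * math.sqrt(wi / (1 - hi)) for ei, wi, hi in zip(resid, w, lev)]
cooks = [wi * hi * ei ** 2 / (Q * (1 - hi) ** 2)
         for ei, wi, hi in zip(resid, w, lev)]
dffits = [ei * math.sqrt(wi * hi) / (1 - hi)
          for ei, wi, hi in zip(resid, w, lev)]

# The leverages sum to Q, so the average leverage is Q/k
total_leverage = sum(lev)
avg_leverage = total_leverage / k
```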
It is important to recognize that, because these diagnostics are closely related, it is not surprising that the same studies may be flagged by several of the diagnostics as having high impact or influence. In fact, it is more surprising (though not impossible) for a study to be flagged by only one of them.

The diagnostics should not be used by themselves to exclude studies from a meta-analysis. The diagnostics are intended to help identify studies that have substantial impact on the estimated regression coefficients. The reference values we have given are not intended to be used like critical values in a significance test, but as criteria for further evaluation. Just because a study has high impact on the analysis does not make it incorrect. However, it is useful to know that a certain study has (or a few studies have) substantial impact on the results. In such cases it is crucial to be sure of the integrity of the studies with high impact.

It is also important to know that the impact of a study may change when the set of covariates in the meta-regression is changed or the set of studies is changed (e.g., when a subset of studies is examined). A study that has high impact may have much less impact when a certain covariate is removed from the covariate set or a certain study is removed from the dataset.

Variance Inflation Factor

The variance inflation factor, VIF_j, for the jth covariate is a diagnostic designed to provide information about the collinearity of the covariate set. One of the consequences of collinearity is that it increases the variance (the square of the standard error) of the regression coefficient estimates. If the standard error of a regression coefficient estimate is too large, it may be difficult to meaningfully interpret that estimate.
For example, suppose that a particular coefficient expresses the difference between the averages of two groups of standardized mean difference effect sizes, that the coefficient estimate is 1.0, and that the standard error of that coefficient is 2. In such a case it is difficult to draw an informative conclusion, because the results imply a 95 percent confidence interval for the difference between the group mean effects of roughly −3 to +5, a range that is consistent with substantially different substantive conclusions.

The VIF_j indicates how much greater the variance of the regression coefficient estimate b_j for the jth covariate is than it would have been if the covariates were totally uncorrelated. A VIF value of 4 for a particular covariate indicates that the standard error of that regression coefficient is twice as large as it would have been if the covariate were uncorrelated with all the other covariates.

A high VIF_j value for the jth covariate does not necessarily mean that the standard error of the coefficient of that covariate is too large for the estimate to be meaningful. For example, suppose that a particular coefficient expresses the difference between the averages of two groups of standardized mean difference effect sizes, that the coefficient estimate is 1.0, and that the standard error of that coefficient is 0.1. In such a case one can still draw an informative conclusion even if VIF = 4, because the results imply a 95 percent confidence interval for the difference between the group mean effects of 0.8 to 1.2, a range that is consistent with substantially the same substantive conclusion of a very large difference between the group mean effect sizes.

Note that VIF_j is not a property of the jth covariate alone, but depends on all of the other covariates as well. Therefore removing one covariate from the covariate set may change (sometimes drastically) the VIF values of several of the other covariates.
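The arithmetic linking VIF to the standard error can be sketched as follows (all values here are hypothetical):

```python
import math

vif = 4.0                # hypothetical VIF for a covariate
se_uncorrelated = 0.05   # hypothetical SE if the covariates were uncorrelated

# VIF multiplies the VARIANCE of the coefficient estimate, so the
# standard error is inflated by the square root of the VIF.
se_actual = se_uncorrelated * math.sqrt(vif)   # twice as large here

# For a model with just two covariates, VIF = 1 / (1 - r^2), where r is
# the correlation between the two covariates (hypothetical value).
r = 0.866
vif_two = 1 / (1 - r ** 2)   # about 4
```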
COVARIANCE

To navigate to this page click More results > Covariance [A]

Figure 86 | Covariance matrix

This page gives us the covariance matrix of the B values. To understand what these covariances represent, imagine that we draw a sample of studies, run the regression, and get an estimate of BYear and BLatitude. We repeat this process j times, and each time get an estimate of BYear and BLatitude. Then, we compute the covariance of BYear with BLatitude over the j samples. This covariance would be 0.0003 [B]. The same idea applies to all cells in the matrix.

CORRELATIONS

To navigate to this page click More results > Correlation [A]

Figure 87 | Correlation matrix

This page gives us the correlation matrix of the B values. To understand what these correlations represent, imagine that we draw a sample of studies, run the regression, and get an estimate of BYear and BLatitude. We repeat this process j times, and each time get an estimate of BYear and BLatitude. Then, we compute the correlation of BYear with BLatitude over the j samples. This correlation would be 0.8444 [B]. The same idea applies to all cells in the matrix.

When the correlation between two covariates is high (close to 1.0 or close to −1.0), this tells us that the two are highly confounded, and it is therefore difficult to isolate the unique impact of each. In this example, this is probably why (using random-effects weights) latitude was statistically significant when used by itself, but not when used in conjunction with year.

INCREMENTS

We use the term “increments” to refer to the process of adding one covariate at a time to the model and studying the change in the variance explained. This approach provides some kinds of information that are not available in a single analysis that includes all the covariates. For example, consider the [Main results] screen shown in Figure 88.
Figure 88 | Main results | Random-effects

The test of the model, the goodness of fit, and the estimates of T2 and R2 all apply to the full model (Allocation, Year, and Latitude). Suppose we want to know these statistics if (a) we include only allocation, (b) we include allocation and year, and (c) we include allocation, year, and latitude. One way to get this information is to actually run a series of analyses, adding one covariate at each iteration. While this will work, it can be a tedious process, and it also requires the researcher to collate the results of all the analyses.

To address this problem the program has automated the process. When you define a model with covariates X, Y, and Z, the program will run an analysis with X, another with X and Y, and another with X, Y, and Z. It will then collate the results, showing the key statistics at each iteration. Additionally, it shows the change in T2 and in R2, as well as a statistical test of the change, at each iteration.

To make it clear how the increments work, we're going to run a series of analyses and present the results for each. Then we'll use these to understand the information on the increments screen. In practice, of course, you would need to run only one model (the one with all the covariates) and then jump directly to the increments screen.

First, we include only the intercept. In Figure 89 we have added all the covariates to the main screen, but only the box for the intercept is ticked, so this will be the only covariate included in the analysis.
Figure 89 | Setup | Intercept only

Figure 90 | Main results | Intercept only

With only the intercept in the model (Figure 90),

• For variance explained by the model, Q = 0.0000, df = 0, p = 1.0000 [A]
• For variance unexplained by the model, Q = 152.2330, df = 12, p = 0.0000 [B]
• R2 for the model is 0.00 [C]

In Figure 91 we add tick-marks for the two dummy variables that represent Allocation.

Figure 91 | Setup | Intercept + Allocation

Figure 92 | Main results | Intercept + Allocation

With intercept + allocation in the model (Figure 92),

• For variance explained by the model, Q = 1.4349, df = 2, p = 0.4880 [A]
• For variance unexplained by the model, Q = 132.3676, df = 10, p = 0.0000 [B]
• R2 for the model is 0.00 [C]

In Figure 93 we add a tick-mark for Year and re-run the analysis.

Figure 93 | Setup | Intercept + Allocation + Year

Figure 94 | Main results | Intercept + Allocation + Year

With intercept + allocation + year in the model (Figure 94),

• For variance explained by the model, Q = 10.7159, df = 3, p = 0.0134 [A]
• For variance unexplained by the model, Q = 30.3951, df = 9, p = 0.0004 [B]
• R2 for the model is 0.56 [C]

Finally, in Figure 95, we add a tick-mark for Latitude and re-run the analysis.

Figure 95 | Setup | Intercept + Allocation + Year + Latitude

Figure 96 | Main results | Intercept + Allocation + Year + Latitude

With intercept + allocation + year + latitude in the model (Figure 96),

• For variance explained by the model, Q = 13.1752, df = 4, p = 0.0105 [A]
• For variance unexplained by the model, Q = 24.0144, df = 8, p = 0.0023 [B]
• R2 for the model is 0.6133 [C]

Alternatively, we could have jumped directly to the full model, run the analysis, and gone to the increments page.
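Conceptually, the increments screen simply collates the four analyses above. The bookkeeping can be sketched as a loop over the nested models, differencing successive rows; here the fitting itself is stubbed out with the T2 and R2 values reported in Figures 90 to 96, since the point is the collation rather than the estimation:

```python
# (label, T2, R2) for each nested model, as reported in Figures 90-96
models = [
    ("Intercept",  0.3088, 0.0000),
    ("Allocation", 0.5596, 0.0000),
    ("Year",       0.1349, 0.5631),
    ("Latitude",   0.1194, 0.6133),
]

rows = []
prev_t2 = prev_r2 = None
for label, t2, r2 in models:
    # "Change from prior" columns: difference from the previous row
    change_t2 = None if prev_t2 is None else t2 - prev_t2
    change_r2 = None if prev_r2 is None else r2 - prev_r2
    rows.append((label, t2, r2, change_t2, change_r2))
    prev_t2, prev_r2 = t2, r2

# The "Year" row shows T2 changing by -0.4247 and R2 by 0.5631
year_row = rows[2]
```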
To navigate to this page:

• Run the analysis that includes all the covariates (Figure 95)
• Click More results > Increments
• Select the statistical model tab (Fixed or Random)

Figure 97 | Increments | Intercept + Allocation + Year + Latitude

Every row in this table copies information from a separate analysis.

• The row labeled Intercept copies information from Figure 90.
• The row labeled Allocation (the second row labeled Allocation, since this is a set) copies information from Figure 92.
• The row labeled Year copies information from Figure 94.
• The row labeled Latitude copies information from Figure 96.

Each column in this table corresponds to a section in the prior figures.

• [A] columns copy information such as T2 from [A] in the prior screens
• [B] columns copy information about goodness of fit from [B] in the prior screens
• [C] columns copy information about R2 from [C] in the prior screens

Additionally, this table presents information about the change from one row to the next.

• [AA] columns show the change in T2 and the test of its significance
• The [CC] column shows the change in R2

Suppose we want information about the model that includes allocation and year. On the line labeled Year, we see that T2 is 0.1349, R2 is 0.5631, the model is statistically significant (Q = 10.72 with df = 3 and p = 0.0134) but fails to explain all the variance (Q = 30.40, df = 9, p = 0.0004). These statistics were copied from the analysis in Figure 94. If we return to that figure, we'll see the same numbers.

The columns labeled [AA] and [CC] are unique to this screen, and address the change as we move from one model to the next. The column labeled “Change from prior” gives the change in T2 and in R2. The column labeled “Test of change” is the corresponding test of statistical significance. For example, consider the line labeled “Year”.
The table shows that T2 changed by −0.4247 (the difference between 0.5596 on the prior line and 0.1349 on the current line). It shows that R2 changed by 56.31% (the difference between 0.00% on the prior line and 56.31% on the current line). It shows that the statistical test for the change yields Q = 8.43, df = 1, p = 0.0037.

Note that the test for change corresponds to the impact of each covariate at the point that it is entered into the model. Thus,

• The change for Allocation corresponds to the impact of allocation with no covariates held constant. The p-value of 0.4880 in Figure 97 corresponds to the p-value of 0.4880 in Figure 92.
• The change for Year corresponds to the impact of year with allocation held constant. The p-value of 0.0037 in Figure 97 corresponds to the p-value of 0.0037 in Figure 94.
• The change for Latitude corresponds to the impact of latitude with year and allocation held constant. The p-value of 0.2331 in Figure 97 corresponds to the p-value of 0.2331 in Figure 96.

Note. The tests in the earlier screens report Z rather than Q. In each case you could square the Z-value in the earlier figure to get the Q-value in Figure 97.

PART 8: THE R2 INDEX

R2 is the proportion of between-studies variance explained by the model. It is analogous to the R2 index commonly reported for the proportion of variance explained by covariates in primary studies. Consider the example shown in Figure 98. We've used latitude as the sole covariate.

Figure 98 | Setup

Figure 99 | Main results | Latitude | Random-effects

In Figure 99 [A] the program shows that the proportion of variance explained by latitude is 0.79. Before turning to the computation, let's take a moment to get an intuitive sense of what this means.

Figure 100 | Dispersion of effects about the grand mean (T2 = 0.3088) vs. dispersion of effects about the regression line (T2 = 0.0633)

Figure 100 includes two plots, as follows.
The left-hand plot shows the dispersion of all studies about a regression line with no covariates. The predicted effect size for each study is the grand mean, and the variance of true effect sizes about the mean (T2) is 0.3088. We've superimposed a normal curve based on T, and it covers about 95% of all true effects in this population.

The right-hand plot shows the dispersion of all studies about a regression line based on latitude. The predicted effect size for each study is the corresponding point on the regression line, and the variance of true effect sizes about the regression line (T2) is 0.0633. We've superimposed a series of normal curves based on T. At any point on the regression line, the curve covers about 95% of all true effects for studies at that latitude.

The normal curves in the right-hand plot are smaller than the normal curve in the left-hand plot. This reflects the fact that by using the regression line at the right to predict effect sizes we are able to make better predictions. In fact, we are able to reduce the unexplained variance by 79.5%. The R2 value describes this reduction in unexplained variance.

At this point we can return to the screen in Figure 99.

• To get the initial amount of variance we run a regression with no covariates and compute T2. Here, T2 is 0.3088 [B], which is the variance of all studies about the grand mean.
• To get the final amount of variance we run a regression with the covariate and compute T2. This value, reported above as 0.0633 [C], is the variance of studies about their predicted value.
• If the initial T2 is 0.3088 and the remaining T2 is 0.0633, the difference (0.2455) is the T2 explained by the model.
• The program [A] then uses these values to compute R2, the proportion explained by the model, using either

R2 = T2_Explained / T2_Total = 0.2455 / 0.3088 = 0.7950,   (1.28)

or, equivalently,

R2 = 1 − T2_Residual / T2_Total = 1 − 0.0633 / 0.3088 = 0.7950.

The test that R2 is zero in the population is the same as the test of the model.
For the model, Q = 18.845, df = 1, p < 0.0001 [E], so we can conclude that R² in the population is probably not zero.

Notes

While the logic of R² is the same for primary studies and for meta-regression, the actual computation is different. (In primary studies the computation is based on the observed variance, while in meta-regression it is based on the true variance. In primary studies all observations are given the same weight, while in meta-regression each study is given a different weight.) For this reason, when used with meta-analysis, the index is sometimes called the R² analog rather than R².

In Figure 100, the normal curves are drawn to capture 95% of the dispersion in true effects, not in observed effects. The observed variance is assumed to include both within-study variance and between-study variance (T²). We are concerned only with the latter.

The normal curves correspond to two standard deviations on either side of the regression line. By contrast, R² is based on the ratio of variances. Therefore, while the variance on the right is 79.5% smaller than the variance on the left, the normal curve is not 79.5% smaller (the ratio of the variances is not the same as the ratio of the standard deviations). Nevertheless, the correspondence is close enough for the purposes of this illustration.

THE SCHEMATIC FOR R²

The program features a schematic illustration of R² (Figure 101). To navigate to this screen click

More > R-squared graphic

Figure 101 | Display R²

Figure 102 | Schematic for R²

In this figure, the bar represents the total variance in effects, T², which we saw earlier in Figure 99 [D] is 0.3088. Note that this is not the observed variance (which includes within-study variance and between-study variance) but rather our estimate of the between-study (true) variance. The full bar reflects the true variance of all effects about the regression line when there are no covariates in the equation (that is, the true variance of all effects about the grand mean).
The light-blue and dark-blue parts of the bar reflect the fact that the total variance can be decomposed into parts that can and cannot be explained by the model.

When we include the covariates in the model, T² is 0.0633. This is the variance of true effects about the regression line, or the variance that cannot be explained by the model. It is represented by the light-blue portion of the bar. If we look back at Figure 100, this portion corresponds to the ratio of the curves at the right (squared) to the curve at the left (squared).

The dark-blue portion of the bar represents the between-study variance that can be explained by the model, shown as 0.2455. We get this value by subtraction: if the unexplained variance with no covariates is 0.3088 and the unexplained variance with covariates is 0.0633, then the variance explained by the covariates is the difference, 0.2455.

We define R² as the proportion of variance that is explained by the covariates. This is the proportion of the bar that is colored dark blue, in this case 0.79. We can get R² by taking the ratio of the dark blue (explained) to the total, using

R² = T²Explained / T²Total = 0.2455 / 0.3088 = 0.7950  (1.29)

which is the formula displayed on the screen. Alternatively, we can take the ratio of the light blue (not explained) to the total and subtract this from 1.0, using

R² = 1 − T²Residual / T²Total = 1 − 0.0633 / 0.3088 = 0.7950  (1.30)

A SEEMING ANOMALY

In both primary studies and meta-analyses, R² is based on two estimates of the variance, i.e., with and without covariates. In primary studies both estimates are based on the same data, and so are linked. If one estimate is too low the other will be too low also, and the computation of R² will be largely unaffected. For meta-analysis, by contrast, the situation is different. In a meta-analysis the two estimates of T² are based on separate analyses, and it's possible for one estimate to be too low while the other is too high.
A simplified version of the possible outcomes is shown in Table 4.

Table 4

                             T² with covariates
T² with no covariates        Underestimate T²             Overestimate T²
Underestimate T²             R² could be accurate (A)     R² will be too low (B)
Overestimate T²              R² will be too high (C)      R² could be accurate (D)

Consider the situation in Cell C, where we overestimate the initial variance and then underestimate the final variance. It seems we have explained more variance than we actually did. Conversely, consider the situation in Cell B, where we underestimate the initial variance and then overestimate the final variance. It seems we have explained less variance than we actually did.

While the error is easiest to see in cells B and C, there will also be error in cells A and D. Even if we underestimate both variances (Cell A) or overestimate both variances (Cell D), the magnitudes of the errors will invariably differ, which will affect the estimate of R².

When the true value of R² is large, these errors are not obvious. If the true value of R² is .40 and we underestimate that value by .10, we simply assume that the correct value is 0.30. However, if the true value of R² is near zero and we underestimate the value, then our estimate could fall below zero. For example, suppose the initial and final values of T² are 0.20 and 0.19. Suppose further that the initial estimate is low and the final estimate is high, so the observed values are 0.19 and 0.20. In this case it will appear that the unexplained variance has increased, which would mean that R² is negative. It's easiest to see this in Cell B, but it can also happen in cells A or D if one estimate has more error than the other, even if both are in the same direction. Since a negative value must be due to sampling error (a proportion of variance cannot be negative), we simply set the value of R² to zero.

ASSESSING CHANGE IN THE MODEL

Suppose we run a model with only year as the covariate. Then we run a model with year plus latitude.
Just as we can report statistics for either model as compared with the null model (the intercept only), we can also report statistics for the second model as compared with the first. Specifically,

• We can test the Q-value for the change in the model, to see if latitude adds any improvement in prediction over and above year.
• We can report the change in T² with the change in the model.
• And we can report the change in R² with the change in the model.

Here, we show how to assess change as we start with only the intercept, add Year, and then add Latitude to the prediction model. In Figure 103 we define a prediction model with the Intercept, Year, and Latitude.

Figure 103 | Setup

Run the analysis

• Select the statistical model tab (Random)
• Click More results > Increments

The table of increments (Figure 105) collates the results of three analyses, as follows

• Intercept only [D]
• Intercept plus Year [E]
• Intercept, Year, and Latitude [F]

At each iteration the program also displays the change in T², the change in R², and the statistical significance of the change.

Figure 104 | Display increments

Figure 105 | Increments

To assess the statistical significance of the model vs. the null model (or the change in R² vs. 0.0) we use the columns at the left (G)

• T² with no covariates is 0.3088, which serves as our baseline for computing R².
• When we use Year as a covariate, T² drops to 0.2377 and R² is computed as .2303. The test that R² is zero is given by Q = 2.21, df = 1, p = 0.1368.
• When we use Year and Latitude as covariates, T² drops to 0.0921 and R² is computed as .7018. The test that R² is zero is given by Q = 14.30, df = 2, p = 0.0008.

To assess the statistical significance of the model vs. the prior model (or the change in R² vs. the prior R²) we use the columns at the right (H)

• T² with no covariates is 0.3088, which serves as our baseline for computing R².
• When we use Year as a covariate, T² drops to 0.2377 and R² is computed as .2303. The change in R² for the second line vs. the first (that is, adding Year to the null model) is .2303. The test that the change is zero is given by Q = 2.21, df = 1, p = 0.1368. For this row, statistics for change versus the prior model are identical to statistics for change versus the null model (since the prior model is the null model).
• When we use Year and Latitude as covariates, T² drops to 0.0921 and R² is computed as .7018. The change in R² for the third line vs. the second (that is, adding Latitude to Year) is .4715. The test that the change is zero is given by Q = 9.17, df = 1, p = 0.0025.

UNDERSTANDING I²

In Figure 106 the program shows results for a regression that included latitude as a covariate. The screen displays two estimates of I², as follows

• For a regression with no covariates [A], I² is 92.12%
• For a regression with covariates [B], I² is 64.21%

In any meta-analysis, the dispersion of observed effects can be partitioned into two parts. One is the dispersion of the true effects, and the other is dispersion due to sampling error. The I² statistic gives us the ratio of true variance to total variance. Since we see the variance of the observed effects, but we care about the variance of the true effects, we can use I² as a link between the two. If we start with the variance of the observed effects and multiply it by I², we get the variance of the true effects. Put simply, we get a sense of what the dispersion would look like if each study had a very large sample size (and therefore minimal error). If I² is near 100%, then a plot of the true effects would look similar to the plot of the observed effects. As I² moves toward 0%, more and more of the observed variance is simply sampling error, and would disappear if the studies were large enough (and we thus eliminated the sampling error).

In a regression we report estimates for two distinct types of I².
On row A we report statistics for a regression with no covariates. On this line, T² is the variance of true effects about the regression line (which here is simply the grand mean). I² tells us what proportion of the variance in observed effects about the regression line would remain if all studies had an extremely large sample size, so that essentially all error was removed.

On row B we report statistics for a regression with covariates. On this line, T² is the variance of true effects about the regression line (which here is based on latitude). Again, I² tells us what proportion of the variance in observed effects about the regression line would remain if all studies had an extremely large sample size, so that essentially all error was removed.

In both cases, the interpretation of I² is the same. If we are presented with the variance of observed effects and we want to know the variance of the true effects, we multiply the former by I² to get the latter. Note that if we multiply the observed variance by I², the value we get is T², which is also presented on rows A and B. While T² is the number we will actually employ in our computations (for example, to assign weights or to compute a prediction interval), I² offers a way to get a visual sense of how the plot would change if we could somehow eliminate the error.

Figure 106 | Main results | Random-effects

Summary

If we start with a plot of the effects (either about the grand mean or about the regression line), some of the observed dispersion reflects differences in the true effects, while some reflects sampling error. If we're interested in the actual utility of the intervention, then we care about the former, not the latter. I² tells us what proportion of the variance reflects the former (T²) and what proportion reflects the latter (V). It also provides a mechanism that allows us to get a sense of what the plot would look like if it were based on the true effects rather than the observed effects.
Specifically, if we construct a normal curve that captures most of the observed effects, we can multiply that curve's width by a factor of I (the square root of I²). This gives us the approximate distribution of the true effects.

Critically, I² is a proportion of variance, not an absolute amount of variance. An I² near 100% tells us that most of the observed variance is due to variation in true effect sizes, but it does not tell us that this variance is substantial. Conversely, a low value of I² tells us that only a small proportion of the observed variance is due to variation in the true effect sizes, but it does not tell us that this variance is trivial.

PART 9: WORKING WITH THE PLOT

You can create a regression plot with one click and then modify it extensively. We will use the BCG analysis for this illustration.

• Create the model shown in Figure 107.
• Then, click Run regression

Figure 107 | Setup

At this point the program displays a [Scatterplot] button on the toolbar

• Click [Scatterplot] [A]

Figure 108 | Main results | Random-effects

The program displays the screen shown in Figure 109.

Figure 109 | Plot of log risk ratio on Latitude | Random-effects

The regression line is based on the regression equation

• The variable on the X-axis varies
• Continuous variables are plotted at their means
• Categorical variables are plotted based on the proportion of studies in each category

For example, suppose that we've included "Hot" as a covariate, which is coded 0 for studies in cold climates and 1 for studies in hot climates. Suppose further that 7/13 studies are coded 1. The mean score for Hot would be 0.54, and the regression would be plotted for studies where Hot is 0.54.

The confidence interval and prediction interval are based on the uncertainty of the coefficient of the variable on the X-axis and the intercept; they do not depend on the uncertainty of the coefficients for the other variables.
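As a sketch of this rule, suppose (hypothetically) that the model included Latitude plus the "Hot" covariate just described. The plotted line would hold Hot at its mean; all coefficient values below are invented for illustration only:

```python
# Hypothetical coefficients -- for illustration only, not the BCG results.
b_intercept, b_latitude, b_hot = 0.30, -0.03, 0.10

# "Hot" is coded 0/1; 7 of the 13 studies are coded 1, so the line is
# drawn with Hot fixed at its mean of 7/13 (about 0.54).
hot_mean = 7 / 13

def line_height(latitude):
    """Height of the plotted regression line at a given latitude."""
    return b_intercept + b_latitude * latitude + b_hot * hot_mean
```

Only the X-axis variable (here, latitude) varies along the plotted line; every other covariate contributes a fixed amount.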
To set the variable for the X-axis

In Figure 110,

• Click [Graph by] on the toolbar [A]
• Or right-click on the variable name on the X-axis [B]

Figure 110 | Plot of log risk ratio on Latitude | Select variable for X-axis

The main screen includes four elements: the studies, regression line, confidence interval, and prediction interval. Each can be shown or hidden independently of the others. For purposes of this tutorial, un-check the buttons as shown in Figure 111 [A].

• Studies <Off>
• Regression line <Off>
• Confidence interval <Off>
• Prediction interval <Off>

Figure 111 | Plot of log risk ratio on Latitude | Blank canvas

Studies

Click [Studies] to display/hide the individual studies as in Figure 112.

Figure 112 | Plot of log risk ratio on Latitude | Studies

The program displays each study as a circle

• To set the circles to be proportionate to the study weight (or not), click Format > Studies
• To edit the appearance of the circles, click Format > Studies
• To modify the color of the circles, click Color > Edit colors > Studies

Regression line

Click [Regression line] to display/hide the regression line as in Figure 113.

Figure 113 | Plot of log risk ratio on Latitude | Regression line

The program displays the regression line. This reflects the predicted effect size (on the Y-axis) for any given value (on the X-axis).

• To edit the appearance of the regression line, click Format > Regression line
• To modify the color of the regression line, click Color > Edit colors > Regression line

CONFIDENCE INTERVAL AND PREDICTION INTERVAL

The confidence interval and prediction interval are two very different indices. The confidence interval reflects the precision with which we estimate the mean value, while the prediction interval reflects the actual dispersion of effects about the mean value. The former is based on the standard error, and the latter on the standard deviation.
For example, consider a simple meta-analysis (no covariates) where we report the mean effect size. The confidence interval is a measure of precision. If the mean effect is 0.5 with a confidence interval of 0.4 to 0.6, this tells us that the mean effect in this population (the population of studies from which the sample was drawn) probably falls in the range of 0.4 to 0.6. The estimate will become more precise as the number of studies increases, and with an infinite number of studies the width of the confidence interval will approach zero.

By contrast, the prediction interval is a measure of dispersion. It does not tell us about the mean effect but rather about the dispersion of effects about that mean. Suppose that the mean effect was 0.5 but the true effects ranged from 0.3 to 0.7. Suppose further that we had an infinite number of studies, and therefore knew the mean effect precisely. The confidence interval would be 0.5 to 0.5, but the prediction interval would still be 0.3 to 0.7.

In real life, of course, we don't have an infinite number of studies and therefore we don't know the mean precisely. If we estimate the mean as 0.5 and we estimate that 95% of studies will fall within 0.2 on either side of the mean, then the prediction interval will take the plus/minus 0.2 into account and add to that the uncertainty in the mean. This is not as simple as adding the width of the confidence interval to the width of the prediction interval (we actually work with the variances), but that's the general idea.

These same ideas apply to meta-regression as well. For any covariate we can estimate the coefficient, and then the confidence interval and the prediction interval for any value of the corresponding covariate. We can then display these on the graph. For example, suppose the regression line shows the relationship between latitude and effect size.
We pick a latitude of X and find that the predicted effect is 0.50 with a confidence interval of 0.40 to 0.60 and a prediction interval of 0.30 to 0.70. This means that in the universe of studies from which we sampled

• The mean effect for a study at latitude X is probably in the range of 0.40 to 0.60.
• The effect size for any single study usually falls in the range of 0.30 to 0.70.

Note that the prediction interval only makes sense when we apply random-effects weights. When we apply fixed-effect weights we assume that all studies at any given latitude have the same true effect size. By definition, the deviation of true effects about the predicted effect is zero. It follows that if the CI is 0.40 to 0.60, the PI will also be 0.40 to 0.60.

When working with the confidence interval or the prediction interval we need to base the intervals on either one-point or simultaneous computations. Click [Computational options] > [One point] or [Simultaneous]

• One-point – In 95% of analyses, the confidence interval at any single latitude will include the true mean effect for that latitude.
• Simultaneous – In 95% of analyses, the confidence intervals at all latitudes will include the true mean effects for those latitudes.
• One-point – In 95% of analyses, the prediction interval at any single latitude will include the true effect for a study selected at random at that latitude.
• Simultaneous – In 95% of analyses, the prediction intervals at all latitudes will include the true effects for studies selected at random at those latitudes.

These examples assume that the confidence level has been set to 95%.

Finally, note that the CI and PI are relatively narrow at the mean of X and get wider as we depart from the mean. This is because any error in the coefficient is amplified as we depart from the mean of X in either direction.

Confidence interval

Click [Confidence interval] to display/hide the confidence interval as in Figure 114.
Figure 114 | Plot of log risk ratio on Latitude | Confidence interval

The confidence interval is a measure of precision. It addresses the mean effect for any given latitude. In Figure 114,

• In our sample of studies, the mean effect size for a study at any given latitude is indicated by the regression line [C].
• In the universe from which we sampled, the mean effect size for a study at any given latitude probably falls in the confidence interval [B] to [D].

Click [Computational options] > [One point] or [Simultaneous]

• One-point – In 95% of analyses, the confidence interval at any single latitude will include the true mean effect for that latitude.
• Simultaneous – In 95% of analyses, the confidence intervals at all latitudes will include the true mean effects for those latitudes.

These examples assume that the confidence level has been set to 95%.

• To set the confidence level (e.g., 90% or 95%), click [Computational options]
• To set the confidence interval to be based on Z or Knapp-Hartung, click [Computational options]
• To edit the appearance of the confidence line, click Format > Confidence interval
• To modify the color of the confidence line, click Color > Edit colors > Confidence interval

Prediction interval

Click [Prediction interval] to show/hide the prediction interval as in Figure 115.

Figure 115 | Plot of log risk ratio on Latitude | Prediction interval

Where the confidence interval is an index of precision, the prediction interval is an index of dispersion. In Figure 115,

• In our sample of studies, the mean effect size for a study at any given latitude is indicated by the regression line [C].
• In the universe from which we sampled, the mean effect size for a study at any given latitude probably falls in the confidence interval [B] to [D].
• In the universe from which we sampled, the true effect size for a single study at any given latitude probably falls in the prediction interval [A] to [E].
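The general relationship between the two intervals can be sketched as follows. This is a simplified illustration of the idea (the CI built from the standard error, the PI widened by T²); CMA's exact formulas, including the Knapp-Hartung variant, may differ:

```python
import math

def point_intervals(pred, se, tau2, crit=1.96):
    """Approximate CI and PI at one point on the regression line.

    pred : predicted mean effect at this covariate value
    se   : standard error of that predicted mean
    tau2 : between-study variance (T2) about the regression line
    """
    ci = (pred - crit * se, pred + crit * se)
    half = crit * math.sqrt(se ** 2 + tau2)   # variances add, not widths
    pi = (pred - half, pred + half)
    return ci, pi

ci, pi = point_intervals(pred=0.50, se=0.05, tau2=0.0633)
```

With tau2 = 0 (the fixed-effect case) the two intervals coincide, as noted earlier.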
Click [Computational options] > [One point] or [Simultaneous]

• One-point – In 95% of analyses, the prediction interval at any single latitude will include the true effect for a study selected at random at that latitude.
• Simultaneous – In 95% of analyses, the prediction intervals at all latitudes will include the true effects for studies selected at random at those latitudes.

These examples assume that the confidence level has been set to 95%.

• To set the confidence level (e.g., 90% or 95%), click [Computational options]
• To set the prediction interval to be based on Z or Knapp-Hartung, click [Computational options]
• To edit the appearance of the prediction line, click Format > Prediction interval
• To modify the color of the prediction line, click Color > Edit colors > Prediction interval

To identify specific studies

The program allows you to identify any study in the plot. In Figure 116,

• Click [Identify study] [A]
• Click on any study [B]
• The program displays the study name [C]

Figure 116 | Plot of log risk ratio on Latitude | Identify studies

Other options for customizing the graph are as follows

Appearance
  Line width         Format > Line width
  Font               Format > Font
  Font size          Format > Font size

Title and labels
  Title              Labels > Title
  X-Axis             Labels > X-axis
  Y-Axis             Labels > Y-axis

Study circles
  Proportionate      Format > Studies
  Line width         Format > Studies

Axes
  Scale for X-axis   Format > X-axis
  Scale for Y-axis   Format > Y-axis
  Decimals           Format > Decimals

Statistical model
  Fixed              Select Fixed tab at bottom of screen
  Random             Select Random tab at bottom of screen

Predictive model
  Model 1            Select desired model at bottom of screen

Export
  To Word            Files > Export to Word
  To PowerPoint      Files > Export to PowerPoint
  To File            Files > Export to File
  To Clipboard       Files > Copy to clipboard

Comments
  Equation           Show / Hide / Edit   (The prediction equation)
  Annotation         Show / Hide / Edit   (For the confidence interval and prediction interval)
  Comment 1          Show / Hide / Edit   (User's optional comment)
  Comment 2          Show / Hide / Edit   (User's optional comment)

Decimals
  Equation in plot   Select number
  X-Axis             Select number
  Y-Axis             Select number

Modify the colors

The program maintains two color schemes. These are called Printing and PowerPoint but can actually be used for any purpose. To switch between the schemes click

• Color > Use colors for printing
• Color > Use PowerPoint

After you've selected one scheme or the other, you can edit the color for any element on the screen. To modify colors for the current color scheme click

• Color > Edit colors (for current scheme)

Categorical variables

The plot for categorical variables works the same way as for continuous variables. In Figure 117 we will plot by Allocation. This is a categorical variable that reflects the mechanism by which patients were assigned to either vaccine or placebo. The group names, corresponding to the type of allocation, are "Randomized", "Systematic", and "Alternate".

In Figure 117 we define the prediction model, then click [Run regression].

Figure 117 | Regression | Setup

Figure 118 shows the main results.

Figure 118 | Regression | Main results | Random-effects

In Figure 119,

• Click [Scatterplot] [A]
• Click Graph by > Allocation [B]

Figure 119 | Regression | Plot | Categorical covariate

The plot shows one column for every category (Randomized, Alternate, and Systematic). The options to show or hide the studies, regression line, confidence interval, and prediction interval are the same as they were for continuous covariates. While there are only two dummy variables (Alternate and Systematic), the program automatically adds a column for the reference category (Randomized).

Setting the scales

The program will automatically set the scale for the X-axis and Y-axis. This works well for any single plot. However, if you want to create a series of plots and ensure that they all employ the same scale, you'll need to set the Y-axis.
Otherwise, the Y-axis may differ from one plot to the next, making it difficult to compare plots. To set the Y-axis manually, click Format > Y-axis.

Figure 120 | Regression | Plot | Setting the scale anchors

PART 10: COMPUTATIONAL OPTIONS

The program allows you to set various computational options. On the regression screen click Computational options on the menu.

Figure 121 | Regression | Set statistical options

KNAPP-HARTUNG VS. Z

In primary studies, when we perform a significance test we have the option to use either the Z-test or the t-test. We use the Z-test when the population variance is known, and we use the t-test when we are using the sample variance to estimate the population variance. The choice of test (t vs. Z) affects the p-value in two ways.

• First, the estimate of the standard error is greater with t than with Z, and so the test statistic is smaller.
• Second, when we use the t-test the value required for statistical significance is larger than it is with Z.

The difference between the two tests is most pronounced when the sample size is small. Once the sample size passes thirty the difference between t and Z is minor, and at one hundred the difference is trivial.

While the choice between t and Z applies to cases where we compare two groups, the same idea applies to cases where we compare more than two groups. Here, the choice is between the F statistic (when the variance is estimated) and chi-squared (when the variance is known). The choices are shown in Table 5.

Table 5 – Test statistics in primary studies

                         Two groups     More than two groups
Variance estimated       t              F
Variance known           Z              χ²

We are faced with a similar situation in meta-analysis. Since the variances are often being estimated from the observed data, it would make sense to use the t distribution to test the null hypothesis and to construct confidence intervals. In fact, though, researchers have traditionally used the Z distribution for these purposes.
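The rule of thumb about sample size can be seen directly in the two-sided 5% critical values. These are standard tabled constants, hard-coded here to avoid assuming any particular stats library:

```python
# Two-sided 5% critical values of t for selected df, vs. z = 1.960.
# The t value approaches z as df grows, which is why the t/Z choice
# matters mainly when the sample (or the number of studies) is small.
T_CRIT_05 = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984}
Z_CRIT_05 = 1.960

for df in sorted(T_CRIT_05):
    ratio = T_CRIT_05[df] / Z_CRIT_05
    print(f"df = {df:>3}: t = {T_CRIT_05[df]:.3f}  (t/z = {ratio:.3f})")
```

At df = 30 the t critical value is only about 4% larger than z, and at df = 100 the difference is under 2%.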
In the case of a fixed-effect model this distinction turns out to have little practical impact. The only source of error is the variance within studies, and since the n within studies (accumulated across studies) is typically well over thirty, the difference between t and Z is negligible. Therefore, the practice of using Z has not been challenged.

However, in the case of a random-effects model the situation is more complicated. Recall that the error component incorporates two distinct elements – the within-study error and the between-study error. We can justify using Z for the within-study error for the same reason that we justify that approach for the fixed-effect model. However, the between-study variance is based on the number of studies, which is typically small, and the difference between t and Z for this component of the variance is typically substantial.

The solution proposed by Knapp and Hartung is to address each component of the variance separately. Specifically, we would use the Z (or chi-squared) distribution for the within-study variance and the t (or F) distribution for the between-study variance.

• When we are estimating the mean effect size in one set of studies, this approach would apply to the test of the null.
• In a subgroups analysis, it would apply to the test that compares the subgroup means.
• In a meta-regression, it would apply to the test of each covariate and to the test of the model.

Note. The program allows you to select either option from the statistics menu. When you select [Z-Distribution] the program uses Z and Q (Figure 123 and Figure 124). When you select [Knapp-Hartung] the program uses t and F (Figure 125 and Figure 126).

Figure 122 shows a prediction model using Allocation, Year, and Latitude as covariates.

Figure 122 | Regression | Setup

Click Computation options > Z-distribution (Figure 123)

Figure 123 | Set statistical options | Z-Distribution vs.
Knapp-Hartung

Figure 124 shows the main-results screen with this option in effect.

Figure 124 | Main results | Z-Distribution

The screen's title shows that the Z-distribution is being employed

• [A] The standard errors are based on Z
• [B] The confidence intervals are based on Z
• [C] The test statistics and p-values for individual covariates are based on Z
• [D] The test statistic and p-value for the set are based on Q
• [E] The test statistic and p-value for the model are based on Q
• [F] The test statistic and p-value for goodness of fit are based on Q
• [G] The test statistic and p-value for the model with only the intercept are based on Q

To select Knapp-Hartung, click Computation options > Knapp-Hartung (Figure 125).

Figure 125 | Set statistical options | Z-Distribution vs. Knapp-Hartung

Figure 126 shows the main results with this option in effect.

Figure 126 | Main results | Knapp-Hartung

The screen's title shows that Knapp-Hartung (KH) is being employed

• [A] The standard errors are based on t
• [B] The confidence intervals are based on t
• [C] The test statistics and p-values for individual covariates are based on t
• [D] The test statistic and p-value for the set are based on F
• [E] The test statistic and p-value for the model are based on F
• [F] The test statistic and p-value for goodness of fit are based on Q, not F. This is because this tests the null that T² is zero. Since the KH adjustment only affects the T² part of the variance, when T² is zero the adjustment is not applied.
• [G] Statistics for the model with only the intercept are based on Q, not F, for the same reason.

In addition to using the t-distribution or F-distribution for the critical values, the KH option also adjusts the standard error underlying all of these statistics.
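In code form, the shift from Z to KH for a single coefficient amounts to two changes: the standard error is inflated, and the reference distribution changes from normal to t. All numbers below are hypothetical, chosen only to show the direction of the effect:

```python
# Hypothetical values -- not taken from the BCG screens.
coef = -0.0292          # a slope estimate
se_z = 0.0066           # conventional (Z-based) standard error
kh_inflation = 1.15     # hypothetical Knapp-Hartung SE inflation factor

z_stat = coef / se_z
t_stat = coef / (se_z * kh_inflation)

# The KH statistic is smaller in magnitude AND is compared against a
# larger (t) critical value, so the KH p-value is never the smaller one.
print(abs(t_stat) < abs(z_stat))   # True
```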
Compare Figure 124, which is based on Z, with Figure 126, which is based on Knapp-Hartung.

Table of coefficients

When we move from Z to Knapp-Hartung:

• The coefficients do not change
• The standard error increases [A]
• The confidence interval width increases [B]
• The Z-score is replaced by a (smaller) t-score [C]
• The p-value becomes larger (less significant) [C]
• The Q-value for a set is replaced by a (smaller) F-score [D]
• The p-value for a set becomes less significant [D]

Test of the model

When we move from Z to Knapp-Hartung:

• The Q-value is replaced by a (smaller) F-value
• The p-value becomes less significant

Goodness of fit

When we move from Z to Knapp-Hartung [F]:

• The numbers do not change. This is because the Knapp-Hartung adjustment only applies to the T² part of the variance, but the goodness-of-fit test is computed assuming T² is zero.

Comparison of Model 1 with the null model

When we move from Z to Knapp-Hartung [G]:

• The numbers do not change. This is because this comparison employs weights based on within-study variance (V), and the Knapp-Hartung adjustment only affects between-study variance (T²).

Notes

While it is always true that the p-value will be the same or higher (further from zero) for Knapp-Hartung (KH), the extent of the difference depends on the amount of between-study variance and the number of studies. To the extent that the between-study population variance is small and/or the number of studies is large, the between-study error variance will be small, and the difference between the Z option and the KH option will tend to be relatively small. Conversely, to the extent that the between-study population variance is large and/or the number of studies is small, the difference between the two options will tend to be relatively large.

You do not need to return to the [Modify models] screen to switch between Z and Knapp-Hartung.
Rather, if you're already looking at the results you can simply change the setting and the results will update.

Figure 124 and Figure 126 showed the impact of this option for the main screen, but the option actually affects all screens that display confidence intervals and/or tests of significance. It also affects the plots, since the confidence interval and prediction interval depend on the standard error and the statistical distribution.

Since the Knapp-Hartung option is intended to address uncertainty in the between-studies variance, most people who use it do so only for the random-effects model. In CMA, the Knapp-Hartung option is only available for random-effects models. While these adjustments can be applied to any use of the random-effects model (that is, for a single group of studies, for a subgroup analysis, and for meta-regression), to date we have only implemented them for meta-regression. We plan to update the other modules in the future.

The intent of the Knapp-Hartung adjustment is to improve the accuracy of p-values, confidence intervals, and prediction intervals. Higgins and Thompson (2004) proposed an approach that bypasses the sampling distributions and instead employs a permutation test to yield a p-value. Using this approach we would compute the Z-score corresponding to the observed covariate. Then, we would randomly redistribute the covariates among studies and see what proportion of these redistributions yields a Z-score exceeding the one that we had obtained. This proportion may be viewed as an exact p-value. This option is not implemented in CMA.

ONE-POINT OR SIMULTANEOUS CONFIDENCE INTERVALS FOR GRAPH

When you plot the regression line of effect size on a covariate in a meta-regression, the program allows you to plot the confidence interval. The confidence interval reflects the uncertainty in the predicted value (the height of the regression line) being plotted.
For example, if we plot the regression line for effect size on latitude, the confidence interval reflects the uncertainty in the predicted value of effect size for each value of latitude, but it treats all other covariates as fixed at their mean.

There are two useful options for plotting the interval. We can plot an interval that is accurate for any single point on the graph, or an interval that is accurate for all points on the graph simultaneously. To select either option click Computation options > Simultaneous/One-point.

• Accurate for one point means that if we were to select any one point on the regression line at random, in 95% of all possible regressions, the true predicted value for that point would fall within the confidence interval displayed at that point.
• Accurate for all points means that if we were to look at all points on the regression line, in 95% of all possible regressions, the true predicted values for all the points would fall within the confidence intervals displayed at those points. Note that this includes predictions for all possible values of latitude, not only those that happen to appear in the data set.

Obviously the second criterion is stricter (we want to make an inference about the predicted value for all values of latitude rather than one) and therefore the confidence interval needs to be wider. We do this by using a multiplier based on the Scheffé adjustment.

The difference between the two can be seen by comparing Figure 127 (one-point) with Figure 128 (simultaneous). The regression line is identical in the two, but the confidence interval is wider in the second. This is true for all points on the plot, but is most evident toward either end of the regression line, since uncertainty in the coefficient becomes more evident as we depart from the mean of the predictor. For example, compare the width of the one-point confidence interval in Figure 127 [A] versus the simultaneous confidence interval in Figure 128 [B].
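The two multipliers can be sketched numerically. The parameter count and degrees of freedom below are made-up values, not taken from the BCG example, and the sketch assumes the standard Scheffé form, in which the simultaneous multiplier is the square root of p times an F critical value.

```python
import numpy as np
from scipy import stats

# Illustrative values (not from the BCG example)
p = 2        # number of model parameters (intercept + latitude)
df = 11      # error degrees of freedom

# Multiplier for a confidence interval accurate at one point
one_point = stats.t.ppf(0.975, df)

# Scheffé multiplier for an interval accurate at all points simultaneously
simultaneous = np.sqrt(p * stats.f.ppf(0.95, p, df))
```

For these values the one-point multiplier is about 2.20 and the simultaneous multiplier about 2.82, which is why the band in Figure 128 is wider than the band in Figure 127 at every point.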
To facilitate this comparison we used Format > Y-Axis to set the same scale for both plots. Click Comments > Show annotation to include the details on the plot, as shown in the bottom right-hand corner of each figure.

Figure 127 | Set statistical options | One-point confidence intervals

Figure 128 | Set statistical options | Simultaneous confidence intervals

OPTIONS FOR ESTIMATING τ2 (MM, ML, REML)

When we select random-effects, the program needs to estimate the value of tau-squared (τ2), the true between-studies variance. (We use the Greek symbol τ2 to represent the true value, and T2 to represent the sample estimate of that value). To select a method click [Computational options] as shown in Figure 129.

Figure 129 | Set statistical options | Estimating T2

There are three approaches commonly used to partition the variance and estimate τ2. These are

• Method of moments (MM), also known as the DerSimonian and Laird method
• Unrestricted maximum likelihood (ML), also known simply as maximum likelihood
• Restricted maximum likelihood (REML)

Each of these methods has advantages and disadvantages. If we are not willing to assume that the effect sizes are normally distributed, MM is often the method of choice. The method of moments does not depend on any assumptions about the distribution of the random effects, so it has a robustness that the two other methods (which assume that the random effects have a normal distribution) do not have.

If we are willing to assume a normal distribution of effects, then statisticians tend to prefer ML or REML, which are more efficient than MM (the estimates have smaller variance). Between ML and REML, ML tends to yield a more precise estimate of T2 (but with a bias) while REML tends to yield a less biased estimate (but with less precision). With small numbers of studies imprecision can be more important than bias, and so some prefer ML. With more studies, the balance may shift in favor of REML.
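The method-of-moments (DerSimonian and Laird) estimator can be sketched in a few lines for the simple no-covariate case. This is an illustrative sketch only (the meta-regression module uses a matrix generalization of the same idea); the function name and the toy data are ours.

```python
import numpy as np

def dersimonian_laird(y, v):
    """Method-of-moments (DerSimonian-Laird) estimate of tau^2
    for a simple random-effects meta-analysis (no covariates).
    y: effect sizes, v: within-study variances."""
    w = 1.0 / v                             # fixed-effect weights
    mean_fe = (w * y).sum() / w.sum()       # fixed-effect mean
    Q = (w * (y - mean_fe)**2).sum()        # heterogeneity statistic
    df = len(y) - 1                         # expected Q under homogeneity
    C = w.sum() - (w**2).sum() / w.sum()    # scaling constant
    return max(0.0, (Q - df) / C)           # truncate negative values at zero
```

The truncation at zero reflects the fact that the excess of Q over its degrees of freedom can be negative by chance, while the true variance cannot.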
Note. If you have already run the analysis and want to modify the statistical option, you do not need to return to [Modify models] and re-run the analysis. Simply click [Computational options] and make a selection.

All three options for estimating τ2 can be used for a basic analysis as well as for regression. However, the basic analysis module in CMA offers the MM option only. (We plan to add other options in the future).

ONE-SIDED VS. TWO-SIDED TESTS

Many statistical tests allow the option of a one-sided or two-sided test.

• Two-sided tests are appropriate when an effect in either direction would be meaningful.
• One-sided tests are appropriate when we only need to identify an effect in one direction, and an effect in the other direction would have the same implications as zero effect.

In the overwhelming majority of social-science and medical research, while we may (indeed almost always do) expect the effect to fall in a specific direction, an effect that was statistically significant in the other direction would still be important. For example, if we expected the treatment to improve survival but it turned out to hurt survival, this would be critically important information. However, if the test had been performed as one-tailed then an effect in the reverse direction (that the treatment is harmful) cannot be statistically significant by definition, even if the computed p-value is < 0.0001. Therefore, except in rare instances, the two-tailed test is appropriate.

In the event that you select a one-tailed test, note that this applies only to the p-values for individual covariates on the main results screen.

• It does not affect the confidence interval, since this is displayed for lower and upper limits.
• It does not affect the p-value for the test of the model. Since this is based on Q (or F) no direction can be specified and it must be two-tailed.
• It does not affect the p-value for a set of covariates.
Since this is based on Q (or F) no direction can be specified and it must be two-tailed. (For consistency, this applies even if the set includes only one covariate.)

• It does not affect the p-value for a test of the increment. Since this is based on Q (or F) no direction can be specified and it must be two-tailed.
• It does not affect the confidence interval or the prediction interval on the plot. Since these are shown for both the lower and upper limit, they are displayed using multipliers for a two-tailed test.

PART 11: CATEGORICAL COVARIATES

Categorical covariates are covariates that represent a category or group, rather than a numerical score. For example, the covariate "Allocation" reflects the mechanism employed to assign patients to either vaccine or placebo. Each study is coded as "Randomized", "Systematic", or "Alternate".

When we perform a subgroups analysis (as with an analysis of variance in a primary study) we can work directly with the categorical covariate and classify each study by its allocation method (e.g., "Systematic"). However, this is not possible when we perform a regression, since regression requires that we work with numbers, not labels. Therefore, rather than working with the original variable, we create so-called "dummy variables", numeric variables that stand for a group or category.

In this chapter we will discuss how to create and interpret these dummy variables. As always, we assume that the reader is familiar with the use of dummy variables in primary regression, and our intent is to show how the same rules apply for meta-regression.

Note. The mechanism for working with dummy variables in a regression depends on whether or not we include the intercept in the regression equation. In this chapter we assume that we will include the intercept. The alternative is discussed in [Part 12: When does it make sense to omit the intercept].

Overview

For a categorical variable with m groups, we need to create m − 1 dummy variables.
Since Allocation has three groups (Randomized, Systematic, Alternate) we need to create two dummy variables. We need to select any one of the three groups to serve as the "Reference" group. Then, we create a dummy variable for each of the other two groups (but not for the reference group). With three groups, we have three options:

A. We can select Randomized to serve as the reference group. The dummy variables will be Systematic and Alternate.
B. We can select Systematic to serve as the reference group. The dummy variables will be Randomized and Alternate.
C. We can select Alternate to serve as the reference group. The dummy variables will be Systematic and Randomized.

In this chapter we discuss:

• How to create the dummy variables
• How to use these in the regression
• How to select the reference group

Dummy variables

CMA is able to create the dummy variables automatically. We can use the Allocation variable as an example. In Figure 130,

• Click on Show Covariates [A]
• Click on Allocation [B]
• Click on Edit reference group and select [Random] [C]
• Click [Add to main screen] [D]

Figure 130 | Creating dummy variables

Since we've set "Random" as the reference group, the dummy variables are "Alternate" and "Systematic". In Figure 131 [E] the program creates these and adds them to the variable list.

Figure 131 | Creating dummy variables

As always, tick the boxes [F] to include these variables in the current predictive model. Tick either of the two boxes, and the other will be ticked automatically. This is because the two jointly represent allocation (not simply because they belong to the same set).

The two dummy variables are Alternate and Systematic. Following the conventions proposed by Cohen, studies are coded "1" if they belong to the dummy variable's group. Thus,

• A study is coded 1 for "Alternate" if it employed alternate allocation, or 0 otherwise.
• A study is coded 1 for "Systematic" if it employed systematic allocation, or 0 otherwise.
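The coding rule can be expressed in a few lines of code. This is a plain-Python sketch of the rule, not anything CMA exposes; the study labels are generic placeholders.

```python
# Dummy coding with "Random" as the reference group: a study gets a 1
# on a dummy variable exactly when it belongs to that variable's group.
# The reference group gets 0 on both dummies.
allocation = {
    "Study 1": "Alternate",
    "Study 2": "Random",       # reference group: all zeros
    "Study 3": "Systematic",
}

dummies = {
    study: {"Alternate": int(a == "Alternate"),
            "Systematic": int(a == "Systematic")}
    for study, a in allocation.items()
}
```

Note that no dummy variable is created for the reference group; its studies are identified by having 0 on every dummy.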
Therefore, the first three studies, shown in Figure 132 [G], should be coded as follows.

• Frimodt-Moller et al (Alternate) should be coded 1 for alternate and 0 for systematic
• TB Prevention trial (Random) should be coded 0 for alternate and 0 for systematic
• Comstock et al 1974 (Systematic) should be coded 0 for alternate and 1 for systematic

Figure 132 | Categorical variables

If you'd like to see the actual codes assigned by the program, proceed as follows.

• Click Run Regression
• Click More results > All studies [H]

Figure 133 | Creating dummy variables

Figure 133 [I] shows that the codes have been assigned as expected.

• Frimodt-Moller et al (Alternate) is coded 1 for alternate and 0 for systematic
• TB Prevention trial (Random) is coded 0 for alternate and 0 for systematic
• Comstock et al 1974 (Systematic) is coded 0 for alternate and 1 for systematic

The selection of a reference group has no impact on the model

To this point we've introduced the idea of a reference group, and shown that the selection of a reference group determines which dummy variables will be created. Critically, while the selection of a reference group can be important for some aspects of the analysis (as discussed below), it has no impact on the statistics for the model. To emphasize this point, we present three versions of the regression.
• In Figure 134 the reference group is "Random", and the dummy variables [A] are Alternate and Systematic
• In Figure 135 the reference group is "Systematic", and the dummy variables [A] are Alternate and Random
• In Figure 136 the reference group is "Alternate", and the dummy variables [A] are Random and Systematic

Figure 134 | Dummy variables | Allocation with "Randomized" as the reference group

Figure 135 | Dummy variables | Allocation with "Systematic" as the reference group

Figure 136 | Dummy variables | Allocation with "Alternate" as the reference group

In all three versions of the analysis, the statistics for the model are exactly the same:

• The test of the model [B] yields a Q-value of 1.4349 with 2 df and p = 0.4880.
• The goodness-of-fit test [B] yields a Q-value of 132.3676 with 10 df and p < 0.0001.
• The estimates of T2 and T [B] are 0.5596 and 0.7480, respectively.
• The estimate of I2 with covariates in the model is 92.45% [B]
• The estimate of R2 is 0.00% [C]

The reason that the statistics for the model are the same regardless of which group serves as the reference group (and which two dummy variables are included in the regression) is that all three versions incorporate precisely the same information. Concretely, once we know a study's code on any two dummy variables, we know precisely which allocation method was employed for that study. It follows that (at least for purposes of the model) it doesn't matter which two dummy variables we used.

Indeed, it must be this way. The set of dummy variables addresses the question "Is allocation related to effect size?" and it must be true that we will get the same answer regardless of which mechanism we employ to represent allocation in the regression.

Working with the "Set"

When the program creates a series of dummy variables to represent a categorical covariate, it automatically defines these dummy variables as a "Set".
In Figure 134, Figure 135, and Figure 136 the program has added a column labeled "Set".

• In this column we see the label "Allocation", which refers to the categorical variable.
• Brackets indicate the two dummy variables that represent allocation.
• Each of these variables has a two-part name, with the first part reflecting the core variable (Allocation) and the second part indicating the dummy variable (e.g., Alternate).

In our example (where there are three groups) the set includes two covariates [A].

• The line for "Alternate" (if it exists) addresses the impact of alternate allocation vs. other types of allocation.
• The line for "Systematic" (if it exists) addresses the impact of systematic allocation vs. other types of allocation.
• The line for "Random" (if it exists) addresses the impact of randomized allocation vs. other types of allocation.

Thus, each of these lines addresses the impact of a specific allocation type. By contrast, the "Set" addresses the impact of Allocation in general. This is an omnibus test that asks if there are any differences in effect size among allocation types. These statistics are shown at the right-hand side of the display [D].

In our example Allocation (in the form of dummy variables) is the only covariate in the equation, and so the test of the set is identical to the test of the model.

• The test of the set [D] yields a Q-value of 1.4349 with 2 df and p = 0.4880.
• Similarly, the test of the model [B] yields a Q-value of 1.4349 with 2 df and p = 0.4880.

Therefore, in this example we really didn't need to present statistics for the set. We could have simply relied on the statistics for the model. However, this equality only holds when the variables in the set are the only variables in the model. Typically, this is not the case. Rather, there will often be additional covariates in the model, and when that's true, the test of the set is quite different from the test of the model.
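The omnibus test for a set is a Wald-type Q statistic computed on the subset of coefficients, using the corresponding block of the coefficient covariance matrix. The sketch below is our own illustration of that idea (this manual does not show CMA's exact computation); the Q value is referred to a chi-squared distribution with df equal to the number of covariates in the set.

```python
import numpy as np
from scipy import stats

def q_test_for_set(b, cov, idx):
    """Wald-type Q test that all coefficients in a set are
    simultaneously zero, with the other covariates partialled.
    b: full coefficient vector, cov: its covariance matrix,
    idx: indices of the covariates belonging to the set."""
    b_s = b[idx]                                  # set's coefficients
    cov_s = cov[np.ix_(idx, idx)]                 # their covariance block
    Q = float(b_s @ np.linalg.solve(cov_s, b_s))  # quadratic form
    df = len(idx)
    p = stats.chi2.sf(Q, df)                      # chi-squared p-value
    return Q, df, p
```

When the set is the only thing in the model (apart from the intercept), this reduces to the usual test of the model, which is why the two Q-values coincide in the example above.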
For example, suppose the model includes Allocation (dummy coded) and also latitude. The set would test the impact of allocation with latitude partialled. By contrast, the model would test the combined (and simultaneous) impact of allocation and latitude. These are two entirely different issues.

How to select a reference group

To this point we've shown that selecting one group rather than another to serve as the reference group has no impact on the statistics for the full model or the statistics for the set. Nevertheless, the selection of a reference group does have implications for which lines are displayed within the set, and how these lines are interpreted.

Consider the case where we use Random allocation as the reference group (Figure 134). In this case the dummy variables are "Alternate" and "Systematic". The predicted effect size for a study is going to be

Y = B0 + B1(Alternate) + B2(Systematic) (1.31)

Studies that employed Random allocation will have a code of 0 on both covariates, and so the predicted value is

Y = B0 + B1(0) + B2(0), (1.32)

or simply

Y = B0. (1.33)

In other words, for studies in the reference group, the predicted value is simply the intercept. For a study in either of the other groups, the predicted value is the intercept plus the coefficient for that group. Thus, the coefficient for Alternate gives us the difference between the predicted effect size in that group and the predicted effect size in the reference group.
If we work with Figure 134, where the reference group is random allocation,

• Random allocation – predicted effect size is the intercept (−0.9905), or −0.9905
• Alternate allocation – predicted effect size is 0.4780 units above the intercept, or −0.5125
• Systematic allocation – predicted effect size is 0.5822 units above the intercept, or −0.4083

If we work with Figure 135, where the reference group is systematic allocation,

• Systematic allocation – predicted effect size is the intercept (−0.4083), or −0.4083
• Alternate allocation – predicted effect size is 0.1042 units below the intercept, or −0.5125
• Random allocation – predicted effect size is 0.5822 units below the intercept, or −0.9905

If we work with Figure 136, where the reference group is alternate allocation,

• Alternate allocation – predicted effect size is the intercept (−0.5125), or −0.5125
• Random allocation – predicted effect size is 0.4780 units below the intercept, or −0.9905
• Systematic allocation – predicted effect size is 0.1042 units above the intercept, or −0.4083

Thus, the predicted value for each group is the same regardless of which group serves as the reference group. The difference is that each version presents a different set of comparisons.

Note. The standard error for the reference group is the standard error of that group's mean effect. By contrast, the standard error for the other groups is the standard error of the difference between that group and the reference group. Similarly, the p-value for the reference group tests the null that the mean effect size in the reference group is zero. By contrast, the p-value for the other groups tests the null that the difference between that group's mean effect and the reference group's mean effect is zero.

Finally, to show the correspondence between this analysis and a subgroups analysis, we show a subgroups analysis (Figure 137) where we've grouped by allocation type.
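The arithmetic for the first of these lists can be verified directly from the coefficients reported in Figure 134, where "Random" is the reference group: each group's predicted mean effect is the intercept plus that group's coefficient.

```python
# Coefficients as reported in Figure 134 (reference group "Random")
intercept = -0.9905      # predicted mean effect for the reference group
b_alternate = 0.4780     # difference: Alternate vs. Random
b_systematic = 0.5822    # difference: Systematic vs. Random

pred = {
    "Random": intercept,
    "Alternate": round(intercept + b_alternate, 4),
    "Systematic": round(intercept + b_systematic, 4),
}
```

Repeating the exercise with the coefficients from Figure 135 or Figure 136 reproduces exactly the same three predicted values, as the text above describes.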
Figure 137 | Subgroups | Allocation type

• The mean effect size for each subgroup [A] is the same as the corresponding number from the regression
• The Q-value for the model [B] is the same as the corresponding number from the regression

In this example, Allocation (or rather the dummy variables that represent allocation) is the only covariate in the regression. Therefore, the intercept is simply the mean effect for the reference group and the coefficients represent the differences in mean effects. If the regression model included other covariates then all the statistics would be adjusted for those covariates.

Note. For the subgroups analysis, Computational options > Random and mixed-effect options must be set to pool estimates of T2.

Creating dummy variables manually

Above, we showed how to create dummy variables automatically. You also have the option of creating dummy variables manually. On the data-entry sheet create a column for a moderator and then (critically) define the moderator type as integer or decimal. Then, enter a value for each study.

While the option to create dummy variables automatically works well in most cases, there are several cases where you'll need to use the manual option.

Interactions

Suppose you have a categorical variable that is represented by a dummy variable A, and you want to assess the impact of that variable and also its interaction with another variable B. You'll need to work with the variables A, B, and AB. In this case it will be easier to create A (and then AB) manually.

Alternate coding schemes

Dummy coding is only one of the options possible for categorical covariates. Texts on multiple regression discuss other options, such as effects coding and contrast coding. You can use any of these coding schemes, but you'll need to create the dummy variables manually.

Regressions with no intercept

The automatic coding scheme is only available when you include the intercept in the equation.
When you omit the intercept (see next chapter) the coding scheme changes (in that case, for m groups you need m rather than m − 1 dummy variables) and the automatic function is not available.

If you create the dummy variables manually, you'll also need to define these as a set manually. This is discussed in Part 13: Working with "Sets" of covariates.

PART 12: WHEN DOES IT MAKE SENSE TO OMIT THE INTERCEPT

In any primary regression or meta-regression we have the option to either include or exclude the intercept from the prediction equation. The decision to omit the intercept usually arises when we are working with a categorical variable, and we will discuss the issue in this context. This decision fundamentally affects the issues addressed by the analysis, as follows.

When we include the intercept in a regression with categorical covariates:

• Coefficients reflect differences in effect size across categories.
• Tests of a covariate address the question "Is the covariate related to effect size?"
• The model tests the null hypothesis that no covariate is related to effect size.

When we omit the intercept:

• Coefficients reflect the absolute effect sizes within categories.
• Tests of a covariate address the question "Is the effect size zero in this category?"
• The model tests the null hypothesis that all groups have a mean of zero.

In all chapters up to this point we have assumed that we are including the intercept, which is typically the case. However, there are cases when we may want to omit the intercept.

• First, we will address a technical issue about coding categorical variables. As explained below, the decision to include or omit the intercept affects the way that we code categorical variables for the analysis.
• Second, we will show how to interpret an analysis where the intercept is omitted, and how this differs from an analysis where the intercept is included.

As always, we caution the reader that this chapter is not a comprehensive treatment of the topic.
We assume that the reader is familiar with these issues from multiple regression in primary studies. Our goal here is to review the key concepts and show how they can be applied in meta-analysis.

The example

The initial data set includes a covariate called latitude. For purposes of this example we need a categorical covariate, so we will create a new variable called Climate, which is coded Hot or Cold (latitude 33 and under, vs. 34 and higher).

Since Climate is categorical, it cannot be inserted directly into the analysis. Rather, we need to create a numerical covariate corresponding to climate and use this in the analysis. We may use any of several schemes for this purpose (dummy coding, effects coding, contrast coding), but what all of these schemes have in common in the usual approach (including an intercept) is that we need m − 1 covariates to represent a covariate with m groups. In the current example there are two groups (Cold and Hot), so we need one covariate.

If we elect to use dummy coding, we can create a covariate called Hot and code it 0 for Cold studies and 1 for Hot studies. Alternatively, we can create a covariate called Cold and code it 1 for Cold studies and 0 for Hot studies. We could use either of these covariates in the analysis, but we could not use both, since one of them would be redundant.

This rule, that we need m − 1 covariates for a categorical variable with m groups, only applies when we include the intercept in the regression. By contrast, when we omit the intercept we actually need m covariates. In the current example we would include two covariates (Hot and Cold) in the analysis.

CMA includes a mechanism to create dummy variables automatically, but this mechanism (since it creates m − 1 covariates) is intended only for cases where we include the intercept. For simplicity, in this chapter we will create all dummy variables manually, whether or not we include the intercept in any given analysis.
Figure 138 shows the two dummy variables.

• "Hot" is coded 1 if the study was located in a hot climate and 0 otherwise [A]
• "Cold" is coded 1 if the study was located in a cold climate and 0 otherwise [B]

Thus, a "1" indicates the presence of the attribute (Hot or Cold) while a "0" indicates the absence of that attribute.

Figure 138 | Data-entry | Dummy variables for Hot and Cold

If we include the intercept we will include either Hot or Cold in the prediction equation. If we omit the intercept we will include both Hot and Cold in the prediction equation.

Note. This becomes more complicated when we have two or more categorical variables, such as Climate and Allocation. This is beyond the scope of this manual.

Interpreting the results

A meta-regression without the intercept will give us the mean effect for each group (Hot and Cold). Before proceeding to the regression, we can use a subgroups analysis to see what these means actually are.

On the main analysis screen click Computational options > Mixed and random effects options and then select the option to "pool within-group estimates of tau-squared" as shown in Figure 139. Then click Computational options > Group by and group by Climate. Select the "Random" tab at the bottom of the screen.

Figure 139 | Basic analysis | Computing T2 in the presence of subgroups

Figure 140 | Basic analysis | Subgroups Cold vs. Hot

Figure 141 | Basic analysis | Subgroups Cold vs. Hot

Figure 140 and Figure 141 show the results of this analysis.

• For the Cold subgroup [C] the mean effect size is −1.1987 with SE = 0.1769. The test addresses the question "Is this effect size zero?" and yields Z = −6.7740, p < 0.0001.
• For the Hot subgroup [D] the mean effect size is −0.2784 with SE = 0.1522. The test addresses the question "Is this effect size zero?" and yields Z = −1.8289, p = 0.0674.
• The test of the between-subgroups variance [E] addresses the question "Does the mean effect size differ by subgroup?" and yields Q = 15.5445, df = 1, p = 0.0001.

To compute the mean for each subgroup in meta-regression we would omit the intercept and include both Cold and Hot as covariates (note that the box for the intercept is unchecked).

Figure 142 | Regression | Setup | No intercept

Figure 143 | Regression | Main results | No intercept

Because we have omitted the intercept, the statistics reflect the mean effect size for each group.

• For the Cold subgroup [CC] the mean effect size is −1.1987 with SE = 0.1769. The test addresses the question "Is this effect size zero?" and yields Z = −6.7740, p < 0.0001.
• For the Hot subgroup [DD] the mean effect size is −0.2784 with SE = 0.1522. The test addresses the question "Is this effect size zero?" and yields Z = −1.8289, p = 0.0674.

Note that these numbers are exactly the same as the numbers we saw in the subgroups analysis (Figure 141). Line CC corresponds to line C, and line DD corresponds to line D.

Finally, in Figure 143, since there is no intercept, the test of the model addresses the null hypothesis that the mean for all groups is zero. The Q-value of 49.2323 with 2 degrees of freedom yields a p-value of < 0.0001. We reject the null and conclude that the mean effect size is probably not zero in at least one of the groups. (The subgroups analysis does not include this test).

PART 13: WORKING WITH "SETS" OF COVARIATES

DEFINING A "SET"

In regression there are times when we use several covariates to capture a concept. For example:

• If we have a categorical covariate with m values, we use m − 1 covariates to represent this variable in the analysis (as discussed in the preceding chapter).
• If we want to assess the relationship between duration of treatment and effect we might include duration, duration2, and duration3 as predictors.
• We may have a series of covariates, such as income and education, that (together) represent the impact of socio-economic status.
• We may have a series of covariates, such as dose and duration, that (together) represent the intensity of a treatment.
• We may have two covariates and also the interaction between them, where the three together represent their influence on outcome.

When we define covariates as a Set, the program reports a test of significance for the Set with all other covariates held constant.

For example, consider the analysis displayed in Figure 144. The two covariates are dummy variables that (as a set) capture the Allocation method. Note that in Figure 144 there is now a column labeled "Set" [C]. Under this column there is a set called "Allocation" and the program has inserted brackets to show which covariates are included in the set.

Figure 144 | Regression | Main results | Assessing the impact of a set

• The statistics for Allocation: Alternate tell us if the use of alternate allocation is related to effect size (when all other covariates are partialled).
• The statistics for Allocation: Systematic tell us if the use of systematic allocation is related to effect size (when all other covariates are partialled).
• The statistics for the set [A] tell us if Allocation as a whole (that is, the use of Random, Systematic, or Alternate allocation) is related to effect size.

When the covariates in the set are the only covariates in the prediction equation (as they are in this example) the statistics for the set will be identical to the statistics for the model. Specifically, the Q-value for the set [A] and the Q-value for the model [B] are both 1.4349. Therefore, in this example (Figure 144) we could have simply employed the test of the model as the test of Allocation.

However, that is not the case when the model includes additional covariates. Consider Figure 145, where the model includes Latitude and Year as well as the two dummy variables for Allocation.
In this case, if we want to know the impact of Allocation with Latitude and Year partialled, we need to use the statistics for the Allocation set, Q = 1.5492, df = 2, p = 0.4609 [A]. We could not use the statistics for the model [B] to test the impact of Allocation, since the test of the model is based on the impact of all four covariates.

A B

Figure 145 | Main results | Assessing the impact of a set

HOW TO CREATE A SET

When we use a Set to represent a categorical variable, CMA creates the dummy variables automatically and links them (with the bracket) automatically. In all other cases we need to link the covariates manually.

For this example we'll work with Latitude-C and Latitude-C2 (latitude centered, and its square). We want to create a set that incorporates these two covariates and call the set "Latitude Set".

In Figure 146

• Move Latitude-C into the model
• Move Latitude-C2 into the model
• Ensure that the covariates intended for the set are sequential in the list
• Highlight these covariates by pressing {SHIFT} and clicking on the covariate names [B]

B

Figure 146 | Setup | Defining a set of covariates

In Figure 147

• Click [Link Covariates] [C]
• Enter the name Latitude Set and click [Ok] [D]

C D

Figure 147 | Setup | Naming a set of covariates

The program has now created a set called "Latitude-C Set" [E] which includes the two covariates (Figure 148). When you run the regression the program will display statistics for this set (Figure 149).

E

Figure 148 | Setup | Naming a set of covariates

Figure 149 | Regression | Main results | Working with a set of covariates

HOW TO REMOVE A SET

• Highlight the set's name [F]
• Click Unlink Covariates [G]

F G

Figure 150 | Main results | Removing a set of covariates

PART 14: INTERACTIONS AND CURVILINEAR RELATIONSHIPS

Suppose we run a regression with two covariates, X1 and X2.
Consider what happens in two cases —

• When there is not an interaction
• When there is an interaction

There is no interaction if the impact of X1 on the effect size is constant for all values of X2 (and vice versa). There is an interaction between two variables if the impact of one variable depends on the magnitude of the second variable (and vice versa).

When there is not an interaction

When there is no interaction we include two covariates in the model – X1 and X2.

a) B1 gives us the main effect for variable X1, for any value of X2
b) B2 gives us the main effect for variable X2, for any value of X1

When we run a regression with covariates X1 and X2 (but not the interaction), the following is true.

• We assume that the impact of X1 is the same for all values of X2. The p-value for X1 is a test of this constant effect.
• We assume that the impact of X2 is the same for all values of X1. The p-value for X2 is a test of this constant effect.

When there is an interaction

By contrast, consider what happens when there is an interaction of X1 and X2. In this case we create a new variable X3 (defined as X1 times X2) and enter all three variables into the prediction equation.

a) B1 gives us the first-order effect for variable X1 when X2 is zero
b) B2 gives us the first-order effect for variable X2 when X1 is zero
c) B3 gives us the impact of the interaction (over and above the first-order effects)

When we run a regression with covariates X1 and X2 and the interaction X3, the following is true.

• We assume that the impact of X1 depends on the value of X2, and so we assess the impact of X1 at a specific value of X2, zero. The p-value is a test of the relationship between X1 and the effect size at this specific value of X2.
• We assume that the impact of X2 depends on the value of X1, and so we assess the impact of X2 at a specific value of X1, zero. The p-value is a test of the relationship between X2 and the effect size at this specific value of X1.
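The interpretation of B1 as the effect of X1 when X2 is zero follows directly from the prediction equation. A minimal numeric sketch (the coefficients below are hypothetical, chosen only for illustration):

```python
b0, b1, b2, b3 = -1.0, 0.5, 0.3, 0.2  # hypothetical coefficients, for illustration only

def predict(x1, x2):
    # Prediction equation with the interaction covariate X3 = X1 * X2
    return b0 + b1 * x1 + b2 * x2 + b3 * (x1 * x2)

def slope_x1(x2):
    # Change in the predicted effect per unit of X1, holding X2 fixed
    return predict(1, x2) - predict(0, x2)

print(slope_x1(0))  # equals b1: the first-order effect of X1 when X2 = 0
print(slope_x1(2))  # equals b1 + 2*b3: the effect of X1 shifts with X2
```

Because the slope for X1 is B1 + B3·X2, the reported test of X1 applies only at X2 = 0, which is why the choice of zero point (centering) matters once an interaction is in the model.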
Centering

To center a variable means to re-scale the variable to have a mean of zero. If the original studies took place in the years 1930, 1935, 1940, 1945, 1950, we could subtract 1940 from each value to yield scores of −10, −5, 0, 5, and 10. If the original covariate is Year, the new one could be called Year-C.

When we include X1 and X2 (but not the interaction, X3) in the equation, the decision to center (or not) has no impact on the p-value for the individual covariates. By contrast, if we include X1, X2, and the interaction X3 in the equation, then the decision to center (or not) will have a substantial impact on the p-values for the individual covariates. This is because we test X1 for the case where X2 is zero (and vice versa).

For example, suppose that X1 and X2 are Latitude and Year.

• Consider the statistics for Year. If we don't center Latitude, then we test the impact of Year when Latitude is 0. If we do center, then we test the impact of Year when Latitude-C is 0, and (it follows) Latitude is around 33.
• Consider the statistics for Latitude. If we don't center Year, then we test the impact of Latitude when Year is 0. If we do center, then we test the impact of Latitude when Year-C is 0 and the actual year is 1948.

Centering is also important if we want to assess the impact of curvilinear relationships. For example, suppose that we want to see if the relationship between latitude and effect size has a curvilinear component. For this purpose we need to enter both Latitude and Latitude² as covariates. If we don't center Latitude, then these two covariates will be highly correlated with each other, and it will be difficult to disentangle the linear from the curvilinear components. By contrast, if we center Latitude and square the centered value, the correlation between the two will be low, and we will be able to identify the unique impact of each.
For these reasons, in the examples that include an interaction (or a curvilinear term) we use variables that have been centered about their mean (for continuous variables) or dummy coded (as explained below, for categorical variables), so that zero is a meaningful value or category.

The program will not automatically create centered variables, nor variables for the interaction. While it is possible to enter these variables manually, it's usually easier to copy the original variables to Excel™, create the new variables, and then copy these back into CMA.

Important note. As always, we assume that the reader who plans to work with interactions has a good understanding of these from primary regression, and we focus here on the elements that are specific to meta-analysis and to the use of this program. Similarly, the very brief overview of centering does not fully address the implications of scaling or centering, or other issues which may affect the results.

The same rules apply for interactions involving categorical variables, continuous variables, or combinations of the two types. For clarity, we present an example for each of three cases. These are

• The interaction of two categorical covariates
• The interaction of a categorical covariate with a continuous covariate
• The interaction of two continuous covariates

INTERACTION OF TWO CATEGORICAL COVARIATES

The original data set includes the covariates Latitude and Year, both of which are continuous. For purposes of this discussion we need two categorical covariates, and we create them by dichotomizing Latitude and Year (see Appendix 5: Creating variables for interactions).

• Hot is coded 1 if the latitude is 34 or less, and is coded 0 if the latitude exceeds 34.
• Recent is coded 1 if the Year is 1945 or later, and is coded 0 if the Year is earlier than 1945.
• Hot x Recent is created by multiplying Hot by Recent.
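The dichotomization above can be sketched as follows (the three study records are hypothetical, not the actual BCG data):

```python
# Hypothetical study records, for illustration only.
studies = [
    {"latitude": 44, "year": 1948},
    {"latitude": 13, "year": 1933},
    {"latitude": 33, "year": 1968},
]

for s in studies:
    s["Hot"] = 1 if s["latitude"] <= 34 else 0    # 1 if latitude is 34 or less
    s["Recent"] = 1 if s["year"] >= 1945 else 0   # 1 if year is 1945 or later
    s["Hot x Recent"] = s["Hot"] * s["Recent"]    # the interaction covariate

print([(s["Hot"], s["Recent"], s["Hot x Recent"]) for s in studies])
# → [(0, 1, 0), (1, 0, 0), (1, 1, 1)]
```

As noted above, in practice these columns would typically be built in Excel™ and pasted back into CMA rather than computed in code.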
Figure 151 shows the model, and Figure 152 shows the main results, which are plotted in Figure 153 and Figure 154. While the program can compute the statistics for interactions it cannot plot these interactions, and therefore these plots were created in Excel™ (see appendix).

Figure 151 | Setup | Interaction of two categorical covariates

A B C D E F

Figure 152 | Main results | Interaction of two categorical covariates

We can display these results as shown in Table 6.

Table 6

                  Time
Climate    Early      Recent
Cold       −1.1154    −1.4416
Hot        −0.2164    −0.3035

Instructions for creating this table are given in the appendix, but the computation is quite intuitive. Since the covariate values for all cells are a combination of 0s and 1s, the expected effect size in each cell is given by the sum of the relevant coefficients from Figure 152.

• Upper-left cell is the Intercept (−1.1154) = −1.1154
• Upper-right cell is the Intercept plus Recent (−1.1154 − 0.3261) = −1.4416
• Lower-left cell is the Intercept plus Hot (−1.1154 + 0.8990) = −0.2164
• Lower-right cell is the Intercept plus Recent, Hot, and the Interaction (−1.1154 − 0.3261 + 0.8990 + 0.2391) = −0.3035

Once we have this table, it's a simple matter to plot the main effects and their interactions as shown in the following plots.

Hot

Does the vaccine's effect differ as a function of climate? Since the equation includes the interaction term, the impact of climate is not a main effect. Rather, it is a first-order effect that may differ for Early studies vs. Recent studies. In Figure 153,

The impact of climate for Early studies (Recent = 0) is indicated by the arrow labeled [B]. The impact of climate for Recent studies (Recent = 1) is indicated by the arrow labeled [BB]. The impact of climate is tested for the Early studies [B] since these are the studies coded 0 for Recent. The p-value for this difference is 0.0184 as shown in Figure 152 [B].

B BB

Figure 153 | Plot | Interaction of two categorical covariates

Recent (vs. Early)

Did the impact of the vaccine change from the Early studies to the Recent studies? Since the equation includes the interaction term, the impact of Recent may differ for Cold studies vs. Hot studies. In Figure 154,

The impact of Recent for Cold studies (Hot = 0) is indicated by the arrow labeled [B]. The impact of Recent for Hot studies (Hot = 1) is indicated by the arrow labeled [BB]. The impact of Recent is tested for the Cold studies [B] since these are the studies coded 0 for the covariate (Hot). For these studies the regression line increases slightly as we move from Early to Recent studies. However, the corresponding p-value for Recent is 0.4311 as shown in Figure 152 [C]. Thus, there is no evidence that effect size is related to Recent.

BB B

Figure 154 | Plot | Interaction of two categorical covariates

Hot x Recent

Does the relationship between Time and effect size vary by Climate? In Figure 155 the regression line for Time in Hot climates is not strictly parallel to the regression line for Time in Cold climates. However, the differences in slopes are minor.

Does the relationship between Climate and effect size vary by Time? The impact of Climate in the Recent studies is only slightly larger than the impact of Climate in the Early studies (as indicated by the difference in the height of the two arrows in Figure 155). These two questions are functionally identical, and the same p-value applies to both. Figure 152 [D] shows the p-value for the interaction is 0.6648.

D

Figure 155 | Plot | Interaction of two categorical covariates

The full set

Is there a relationship between Time, Climate, and the interaction (as a set) and the effect size? Since these three are the only covariates in the model, this is addressed by a test of the full model, as shown in Figure 152 [E].

• The p-value of 0.0016 allows us to reject the null hypothesis that none of the covariates is related to effect size.
• The R² analog [F] is 0.66, which tells us that 66% of the initial between-study variance in effect sizes can be explained by this combination of covariates.
• Thus, we conclude that the full model (Time, Climate, and the interaction between them) is able to explain at least some of the variance in effect size.

INTERACTION OF A CATEGORICAL COVARIATE WITH A CONTINUOUS COVARIATE

The original data set includes the covariates Latitude and Year, both of which are continuous. For the purpose of this example we created the following variables based on Latitude and Year (see Appendix 5: Creating variables for interactions).

• Year-C is the study year, centered
• Hot is coded 0 for studies in Cold climates, and 1 for studies in Hot climates
• Year-C x Hot is the interaction

Figure 156 shows the model, and Figure 157 shows the main results. While the program can compute the statistics for interactions it cannot plot these interactions, and therefore the plots (Figure 158, Figure 159, and Figure 160) were created in Excel™ (see appendix).

Figure 156 | Setup | Interaction of categorical and continuous covariates

A B C D E

Figure 157 | Main results | Interaction of categorical and continuous covariates

In two-way analysis of variance we sometimes display the results as a 2x2 table, and we can do the same here, as shown in Table 7.

Table 7

                  Year-C
Climate    −15        20
Cold       −1.0184    −1.9286
Hot        −0.0276    −0.3929

Instructions for creating this table are given in the appendix. Once we have this table, it's a simple matter to plot the main effects and their interactions as shown in the following plots.

Hot

Does the vaccine's effect differ as a function of climate? Since the equation includes the interaction term, the impact of climate is not a main effect, but rather varies depending on the time frame. Here, it will be evaluated for studies where Year-C = 0 (and the actual year is 1948) as indicated by the arrow in Figure 158 [C].
When Year-C is zero (and the actual year is 1948) the effect size for the Cold studies (at the bottom of the arrow) is substantially larger than the effect size for the Hot studies (at the top of the arrow). The p-value for this difference is 0.0007 as shown in Figure 157 [C].

C

Figure 158 | Plot | Interaction of categorical and continuous covariates

Year-C

Did the impact of the vaccine change over time? Since the equation includes the interaction term, the impact of Year is not a main effect. Rather, it is a first-order effect that may vary with climate. In Figure 159,

The impact of Year for Cold studies (Hot = 0) is indicated by the arrow labeled [B]. The impact of Year for Hot studies (Hot = 1) is indicated by the arrow labeled [BB]. The impact of Year is tested for the Cold studies [B] since these are the studies coded 0 for Hot. The effect size increases (moves away from zero) as we move from 1930 to 1970, but the corresponding p-value for Year-C is 0.3430 as shown in Figure 157 [B]. Thus, there is no evidence that effect size is related to year.

BB B

Figure 159 | Plot | Interaction of categorical and continuous covariates

Hot x Year-C

Does the relationship between Year and effect size vary by Climate? In Figure 160 the regression line for Year in Hot climates is not strictly parallel to the regression line for Year in Cold climates. However, the differences in slopes are minor.

Does the relationship between Climate and effect size vary by Year? The impact of Climate in 1968 is only slightly larger than the impact of Climate in 1933 (as indicated by the difference in the height of the two arrows in Figure 160). These two questions are functionally identical, and the same p-value applies to both. Figure 157 [D] shows the p-value for the interaction is 0.6343.

D

Figure 160 | Plot | Interaction of categorical and continuous covariates

The full set

Is there a relationship between Year-C, Hot, and the interaction (as a set) and the effect size?
Since these are the only covariates in the model, this is addressed by a test of the full model [Figure 157 E].

• The p-value of 0.0008 allows us to reject the null hypothesis that none of the covariates is related to effect size.
• The R² analog [Figure 157 F] is 0.68, which tells us that 68% of the initial between-study variance in effect sizes can be explained by this combination of covariates.
• Thus, we conclude that the full model (Year, Climate, and the interaction between them) is able to explain at least some of the variance in effect size.

INTERACTION OF TWO CONTINUOUS COVARIATES

In this example we assess the impact of Year-C, Latitude-C, and Year-C x Latitude-C. For instructions on creating the data set used here, see Appendix 5: Creating variables for interactions.

• Latitude-C is the latitude, centered
• Year-C is the study year, centered
• Latitude-C x Year-C is the interaction

Figure 161 shows the model, Figure 162 shows the main results, and Figures 163 through 165 show plots of these results. While the program can compute the statistics for interactions it cannot plot these interactions, and therefore these plots were created in Excel™ (see Plotting the interaction of two continuous covariates).

Figure 161 | Setup | Interaction of two continuous covariates

A B C D E F

Figure 162 | Main results | Interaction of two continuous covariates

Working with the screen shown in Figure 162 we can create Table 8. Then we use the numbers in this table to create the subsequent plots in Excel™. Details are provided in the appendix.

Table 8

               Latitude-C
Year-C    −20        0          +21
−15       0.1319     −0.6171    −1.4035
20        −0.2796    −0.7835    −1.313

Latitude-C

Does the vaccine's effect differ as a function of latitude? Since the equation includes the interaction of Latitude and Year, the impact of latitude is not a main effect, but rather varies depending on the Year.
The impact of Latitude on the effect size will be evaluated for studies where Year-C = 0 (and the actual year is 1948) as indicated by the arrow [B] in Figure 163. When Year-C is zero (and the actual year is 1948) the effect size for the studies where Latitude is 55 (at the bottom of the arrow) is substantially larger than the effect size for the studies where Latitude is 13 (at the top of the arrow). At this Year there is a statistically significant relationship between latitude and effect size, with p = 0.0066 as shown in Figure 162 [B].

B

Figure 163 | Plot | Interaction of two continuous covariates

Year-C

Did the impact of the vaccine change over time? Since the equation includes the interaction of Year by Latitude, the impact of Year is not a main effect. Rather, it is a first-order effect that may vary with Latitude. The impact of Year-C will therefore be evaluated at the point where Latitude-C = 0 (and the actual latitude is 33). This is represented by Line [C] in Figure 164. At this latitude the regression line seems to be relatively horizontal. The corresponding p-value for Year-C is 0.7594 as shown in Figure 162 [C]. Thus, there is no evidence that effect size is related to year at this latitude.

C

Figure 164 | Plot | Interaction of two continuous covariates

Latitude-C x Year-C

Does the relationship between Year and effect size vary by latitude? While the lines for the different latitudes in Figure 165 are not strictly parallel, the differences in slopes are minor and not statistically significant.

Does the relationship between latitude and effect size vary by Year? The impact of latitude in 1968 is only slightly larger than the impact of latitude in 1933 (as indicated by the difference in the height of the two arrows in Figure 165). These two questions are functionally identical, and the same p-value applies to both. Figure 162 [D] shows the p-value for the interaction is 0.7181.
Figure 165 | Plot | Interaction of two continuous covariates

The full set

Is there a relationship between Year-C, Latitude-C, and the interaction (as a set) and the effect size? Since these are the only covariates in the model, this is addressed by a test of the full model, as shown in Figure 162 [E].

• The p-value of 0.0097 allows us to reject the null hypothesis that none of the covariates is related to effect size.
• The R² analog [F] is 0.60, which tells us that 60% of the initial between-study variance in effect sizes can be explained by this combination of covariates.
• Thus, we conclude that the full model (Year, Latitude, and the interaction between them) is able to explain at least some of the variance in effect size.

CURVILINEAR RELATIONSHIPS

Earlier, we established that there is a linear relationship between latitude and effect size. Suppose we want to test the hypothesis that the relationship between latitude and effect size is actually curvilinear – for example, that the vaccine's impact is relatively constant as we move from a latitude of 13 to 30, but then increases as we move from 30 to 55. Or, that the vaccine's impact increases sharply as we move from a latitude of 13 to 30, but is relatively unchanged beyond that point.

A curvilinear relationship can be seen as a kind of interaction, and that is the approach we take here. In any interaction we ask if the impact of one covariate depends on the level of another covariate. Typically, the two covariates are distinct (A and B). Here, they are the same (A and A) but the idea is the same. We are asking if the impact of A depends on the level of A.

When working with curvilinear (or higher-order) relationships it's generally a good idea to center variables, and that's the practice we follow here. To assess the hypothesis that there is a curvilinear relationship between latitude and effect size we'll need two covariates.

• Latitude-C is simply Latitude centered, to have a mean of zero.
• Latitude-C2 is the square of Latitude-C.

For information on how to create these variables see Appendix 5: Creating variables for interactions.

Figure 166 shows the model, Figure 167 shows the main results, and Figure 168 shows a plot of these results. While the program can compute the statistics for curvilinear relationships it cannot plot these relationships, and therefore Figure 168 was created in Excel™ (see appendix).

Figure 166 | Setup | Curvilinear relationship

A B C

Figure 167 | Main results | Curvilinear relationship

Figure 168 | Plot | Curvilinear relationship

In this example, where the effects are in log units and range downward from zero, an effect size of zero reflects no effect while an effect size of −1.5 reflects a substantial effect. In this example, as the latitude increases the effect size increases (moves away from zero). We want to evaluate the linear component of this relationship, the curvilinear component, and then the two (as a set).

Latitude-C

The line for Latitude-C in Figure 167 [A] addresses the linear relationship between latitude and effect size. The p-value is < 0.0001, which tells us that the linear relationship is statistically significant.

Note. Since we have included an interaction, the linear component varies as a function of latitude. Therefore, the coefficient for Latitude-C is not a slope but rather the tangent to the curved line where Latitude-C is zero.

Latitude-C2

The line for Latitude-C2 in Figure 167 [B] addresses the curvilinear component of the relationship between latitude and effect size. The line in the plot is curvilinear, with a slope that is initially shallow but increases as the latitude increases. Is this line a better fit for the data than a straight line would be? This is addressed by the p-value for this covariate, which is 0.2600. There is no evidence that the relationship is curvilinear.
Test of the model

Since the only covariates in the model are Latitude-C and Latitude-C2, the model [C] tests the null hypothesis that both coefficients are zero. In Figure 167 [C] the Q-value is 17.90, and with 2 degrees of freedom the p-value is 0.0001. We can conclude that there is a relationship between these covariates (as a set) and the effect size.

PART 15: MISSING DATA

In regression for meta-analysis, as in regression for primary studies, there are many options for dealing with missing data. The program takes a very simple approach to missing data, as follows. If a study is missing data for the outcome or for any of the covariates in the covariate list, that study is excluded from the analysis. Note that this exclusion is based on all covariates listed on the main screen, and not only on the covariates that are checked.

Figure 169 | Setup

To show how the program handles missing data we need to create some missing data. For this purpose,

• Return to the data-entry screen (Figure 170)
• Highlight the three cells in the latitude column for studies 9, 10, 11 [A]
• Press the Delete key

A

Figure 170 | Data-entry | Missing data for latitude

Run the analysis. The main analysis screen (Figure 171) shows that the three studies are missing latitude [B]. These studies are still included in this analysis of the mean effect.

B

Figure 171 | Basic analysis | Missing data for latitude

• Proceed to the regression module
• Create a model that includes latitude, and tick latitude, as in Figure 172 [C]

C

Figure 172 | Regression | Setup | Latitude in list and checked

The results are shown in Figure 173. Note that the analysis is based on 10 studies [D] rather than 13, since three studies have been excluded.

D

Figure 173 | Regression | Main results | Missing data

• To see which studies have been excluded click More results > All data
• The program displays a line for every study in the database, and missing data points are highlighted in red (Figure 174).
E

Figure 174 | Table of missing data

This is a good way to identify the missing data and also to identify patterns of missing data.

• If data is missing primarily for one covariate across a lot of studies you may decide to remove that covariate from the analysis.
• If data is missing primarily for a few studies across many covariates you may decide to remove those studies and keep the covariates.

Of course, the decision to adopt one of these approaches (or some other) will depend on a host of factors, with attention paid to avoiding bias. However, the ability to identify the patterns of missing data is a crucial first step in this process.

In some cases, you may want to use another approach for missing data. For example, you may want to replace missing data with the mean, or a value imputed in some other way. You can do this by returning to the main data-entry screen and simply entering the desired value in place of the missing value. In a more sophisticated version of this scheme you can create several variables based on the same variable, but using different approaches to missing data. For example, suppose the initial variable is Dose. You can create one variable called DoseA that replaces missing data with the mean, and another variable called DoseB that replaces missing data with another imputed score. Then, in any given model you would use one or the other, but not both.

Important

Missing data is based on all covariates in the list, and not only those that are checked. The program works this way to ensure that if you define two or more prediction models, the same studies will be used in all the models. In Figure 175 we have un-checked Latitude [F], but the three studies are still excluded from the analysis, as we can see in Figure 176 [G].

F

Figure 175 | Setup | Latitude in list, unchecked

G

Figure 176 | Main results | Latitude in list, unchecked

In this example the studies are being excluded because they are missing a value for latitude.
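The exclusion rule and the simple mean-imputation variant described above can be sketched as follows (the study records are hypothetical, not the actual BCG data):

```python
# Listwise deletion as described above: a study is dropped if it is missing
# the effect size or ANY covariate in the list, checked or not.
studies = [
    {"name": "Study 1", "effect": -1.1, "latitude": 44,   "year": 1948},
    {"name": "Study 2", "effect": -0.3, "latitude": None, "year": 1933},
    {"name": "Study 3", "effect": -0.8, "latitude": 33,   "year": 1960},
]
covariate_list = ["latitude", "year"]

complete = [s for s in studies
            if s["effect"] is not None
            and all(s[c] is not None for c in covariate_list)]
print([s["name"] for s in complete])   # → ['Study 1', 'Study 3']

# A "DoseA"-style variant: replace a missing value with the mean of the
# observed values before running the regression.
observed = [s["latitude"] for s in studies if s["latitude"] is not None]
mean_latitude = sum(observed) / len(observed)   # (44 + 33) / 2 = 38.5
imputed = [s["latitude"] if s["latitude"] is not None else mean_latitude
           for s in studies]
print(imputed)   # → [44, 38.5, 33]
```

In CMA itself the imputed value would simply be typed into the data-entry screen; the code only illustrates the logic.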
How can we include these studies in the analysis?

• If we want to use latitude as a predictor, the only option is to return to the data-entry screen and enter a value for latitude for each of these studies.
• If we are willing to run the regression without latitude, then we need to remove latitude from the list of covariates on the main screen.

It is not sufficient to simply un-tick latitude. Rather, we must remove it as shown in Figure 177.

• Highlight "Latitude" [G]
• Click "Remove covariates" [H]

G H

Figure 177 | Setup | Latitude must be removed from list

Figure 178 | Setup | Latitude removed from list

We now run the analysis again (Figure 179) and see that the number of studies [I] has returned to 13.

I

Figure 179 | Main results | Latitude removed from list

Note. There are more sophisticated methods for handling missing data, such as multiple imputation and selection models for non-ignorable missing data. While these are beyond the scope of this manual, these and other methods can be implemented using CMA. You would use an external program to determine the data value for each study, and then input this value via the data-entry screen.

In this chapter we assumed that each study has been entered into the database as one row, and the data for that row is either present or absent. In Part 18: Complex data structures we show how to create data by combining data across subgroups, outcomes, or time-points, and how missing data is handled in that case.

PART 16: FILTER STUDIES

In some cases you may want to run a regression using a subset of the data. For example, you may want to limit the analysis to studies that employed acceptable methods for randomization and double-blinding. Or, you may want to limit the analysis to studies that were performed within the past ten years, or to studies that employed specific variants of the intervention or that enrolled persons from specific populations.
This procedure is called "Filtering", in that we create a filter, and only studies that pass through the filter are submitted for the regression. The process is actually very simple, but requires that you understand the relationship among three distinct modules in the program. This is shown schematically in the following three figures.

1. In the data-entry module we enter data for all studies (Figure 180)
2. In the main analysis module we can create filters (Figure 181)
3. Studies that pass through the filters are submitted to the regression module (Figure 182)

Data-Entry

Figure 180 | Data entry

Basic-Analysis

All filtering is done here

Figure 181 | Basic analysis

Meta-regression

Figure 182 | Meta-regression

We provide a few examples of filtering.

Example 1

Suppose you want to exclude two specific studies (Aaronson, Stein & Aaronson) by name.

On the main analysis screen (Figure 183), right-click on the names and select "Select by Study name".

Figure 183 | Select by study name

In Figure 184, un-tick the two studies and click [Ok].

Figure 184 | Select by study name

The main analysis is now based on the remaining 11 studies. When you proceed to meta-regression, only these studies will be transmitted.

Example 2

The process of excluding studies by name works well if you only need to exclude a few studies, but becomes tedious and error-prone if you have a large database and need to exclude many studies. In this case, there are better options.

Suppose that you want to run a series of analyses using a specific subset of the studies. Create a categorical moderator (let's call this Set-A) and code each study as belonging to this set (or not) as in Figure 185 [A].

A

Figure 185 | Create a moderator for filtering

On the main analysis screen

• Click Computation options > Select by
• Click the tab for Moderator
• Select Set-A
• De-select "No"

The analysis is now based on the eight studies that had been coded "Yes". The mean effect size is −0.5894 [B].
B

Figure 186 | Filter by moderator

Click Analysis > Meta-regression 2 and run the regression.

D C

Figure 187 | Regression using a filter

The regression is based on these eight studies [C]. The mean effect size (the intercept) is −0.5894 [D]. You may create a series of these "Sets", and easily switch between them. Run the regression for Set-A, and then for Set-B.

Example 3

You can filter studies based on existing moderators. For example, suppose you wanted to run an analysis using studies in a Hot climate that employed either Systematic or Random allocation.

• Click Computational options > Select by
• Select Climate
• Tick [Hot]

Figure 188 | Select by moderator

• Click Add Filter
• Select Allocation
• Tick "Random" and "Systematic"

E

Figure 189 | Select by moderator

The analysis is now based on studies that meet both criteria, as shown in Figure 189 [E]. To ensure that things are working as intended, after running the regression click on All studies to see which have been used in the regression (Figure 190).

Figure 190 | Filter by moderator

The studies in the regression are the same ones that were included in the main analysis when we applied these filters (Figure 189).

PART 17: DEFINING SEVERAL MODELS

Typically you will create one prediction model, which is the list of covariates to be included in the analysis. Then, you might try another model, and another, working one model at a time.

Figure 191 | Defining several models | Setup

The program offers another option – to define a number of prediction models at once, and then run them all simultaneously. Here, for example, the user has defined one model that includes only the intercept [A], a second that adds latitude [B], and a third that adds year [C].

A B C

Figure 192 | Defining several models | Setup
Figure 193 | Defining several models | Main-analysis | Intercept

In Figure 193 the user has clicked on the tab labeled “Intercept” [A]. The screen displays the statistics for the analysis based on the intercept alone.

Figure 194 | Defining several models | Main-analysis | Intercept + year

In Figure 194 the user has clicked on the tab labeled “+Year” [B]. The screen displays the statistics for the analysis based on the intercept and year.

Figure 195 | Defining several models | Main-analysis | Intercept + year + latitude

In Figure 195 the user has clicked on the tab labeled “+Latitude” [C]. The screen displays the statistics for the analysis based on the intercept, year, and latitude.

Why would we want to define more than one model?

In the running example we defined three models on the main screen. Why take this approach, rather than simply working with one prediction model at a time? There are two reasons why this option may be useful.

First, the program summarizes the results of all models on one screen (Figure 196). To navigate to this screen click More results > [Compare models Detailed]

Figure 196 | Defining several models | Main-analysis | Intercept + year + latitude

Second, the program displays a test for the difference in the explanatory power of the models (Figure 197). For example, the cell indicated by [A] compares the model that includes the intercept and latitude with the one that adds year as well. This option is only available when one model is a subset of the other. When this is not the case, the corresponding cell will be left empty.

Figure 197 | Defining several models | Main-analysis | Intercept + year + latitude

How do we choose which covariates to include in each model? This depends on the questions we want to address. For example, suppose the primary goal of the analysis is to assess the impact of treatment.
• One series of covariates, such as mean age and location, is seen as noise
• One series of covariates represents the treatment condition
• One series (such as dose by treatment) represents potential interactions

We might define one model as “Nuisance”, a second as “Plus Treatment”, and a third as “Plus interactions”. Then, the summary screen provides a quick look at the three models, while the comparison screen shows the statistical tests of the differences among them.

Working with multiple predictive models

As shown in Figure 198, to create a series of prediction models, use the toolbar [A] to
• Insert new models
• Delete models
• Rename models
• Move models left or right

Figure 198 | Defining several models | Setup

Two common scenarios for multiple models are an incremental sequence and a diagonal sequence. The program can generate these kinds of series automatically.

The incremental sequence is shown in Figure 199. The first model includes the intercept only, and then one covariate is added at each step. The name of each model is “plus” that variable, since the variable has been added to the existing covariates. To create this sequence click Generate sequence > Incremental sequence [A]

Figure 199 | Defining several models | Main-analysis | Intercept + year + latitude

The diagonal sequence is shown in Figure 200. Each model includes the intercept plus one covariate. The name of each model is that variable alone. To create this sequence click Generate sequence > Diagonal sequence [B]

Figure 200 | Defining several models | Setup | Year or Latitude

Note. When you run multiple models, always be sure to select the desired tab at the bottom when studying the results. These tabs control all tables that are model-specific. In the earlier example we created models called “Intercept”, “+Year”, and “+Latitude”. Then we click on “Scatterplot”. We need to select a tab at the bottom of the screen to select the model for the scatterplot.
In Figure 201 we’ve clicked “+Year” [B] and the scatterplot is based on the intercept and year (as we can see from the prediction equation [C]).

Figure 201 | Multiple predictive models | Plot based on Year

By contrast, in Figure 202 we’ve clicked “+Latitude” [D] and the scatterplot is based on the intercept, year, and latitude, as we can see from the prediction equation [E].

Figure 202 | Multiple predictive models | Plot based on Year + Latitude

The same holds true for most screens. Specifically, as shown in Table 9
• Screens that present results for one predictive model will change as the user selects one or another model using the tabs at the bottom.
• Screens that collate results for all predictive models do not change as the user selects one or another model using the tabs at the bottom.

Table 9
• Varies by model: Main results, Scatterplot, R2 graphic, Covariance, Correlation, Diagnostics
• Identical for all models: All studies, Valid studies, Increments, Models summary, Compare models (detailed), Compare models (p-values)

PART 18: COMPLEX DATA STRUCTURES

On the main data-entry screen every row usually represents a single study. If there are 20 studies on the data-entry screen there will be 20 studies on the main analysis screen, and 20 studies transmitted to the meta-regression module. However, in the case of complex data sets the situation is a bit more complicated. There are two kinds of complex data that we need to address.
• One is the case where we include two or more independent subgroups for some (or all) studies.
• The other is the case where we include two or more non-independent outcomes, time-points, or comparisons for some (or all) studies.

The program has a mechanism in place for dealing with these studies. The key to this mechanism is that all the data filtering and merging is performed on the main data-analysis screen.
The rows displayed on this screen are the rows that will be transmitted to the regression module. In the regression module these rows will be treated as independent of each other.

INDEPENDENT SUBGROUPS WITHIN STUDIES

Consider the case where studies assess the impact of a drug, and report the results separately for males and for females. To record data for independent subgroups (where each subject appears in one subgroup or the other, but not both) we use Insert > Column > for subgroups

Figure 203 | Data-entry | Complex data-structures

In this example we have five studies, and each reports the effect separately for males and females. For illustrative purposes we’ve set the effect size 0.20 points higher for males vs. females, and we’ve set the variance the same (0.10) for all subgroups in all studies.

Figure 204 | Data-entry | Complex data-structures

We have two options.
• We can use the subgroup as the unit of analysis
• We can combine the subgroups for each study, and use the study as the unit of analysis

Using subgroup as the unit of analysis

We can use subgroup as the unit of analysis. In this case the variance of the summary effect should be 0.10/10, or 0.01, and the standard error should be 0.10. On the data-analysis screen we right-click on the column labeled “Subgroup within study” and select “Use subgroup as the unit of analysis”

Figure 205 | Basic analysis | Subgroup within-study as unit of analysis

Figure 206 | Basic analysis | Subgroup within-study as unit of analysis

As expected, in Figure 206 the variance [A] is 0.01 and the standard error [B] is 0.10. Now, we proceed to the meta-regression module. For illustrative purposes we’ll run a regression with only the intercept.

Figure 207 | Regression | Subgroup within-study as unit of analysis

• In Figure 207 the number of studies in the analysis [C] is 10.
• The standard error [D] is 0.10 (which implies a variance of 0.01).

These are the same numbers we saw in Figure 206.
Finally, we can click on More Results > All studies to display the data being used in the meta-regression (Figure 208). We see that the program is, in fact, working with the same 10 units that we had seen on the main analysis screen (Figure 206).

Figure 208 | Regression | Subgroup within-study as unit of analysis

Using study as the unit of analysis

Immediately above we used subgroup as the unit of analysis. We also have the option of using study as the unit of analysis. In this case the program will merge the data for the subgroups within each study to yield study-level data. These data will then be used in the main analysis, and also in the meta-regression. Since we are working with independent subgroups within each study, the variance of the estimate should be approximately the same regardless of whether we choose to use (a) subgroup or (b) study as the unit of analysis. Above, we saw that when we used subgroup as the unit of analysis the variance of each subgroup was 0.10, and so the variance of the combined effect was 0.10/10 = 0.01. When we use study as the unit of analysis the variance within each study is 0.10/2 (for two subgroups), or 0.05. Then, the variance of the combined effect is 0.05/5, which (as before) is 0.01.

To see this, let’s return to the main analysis screen and select Use Study as the unit of analysis

Figure 209 | Basic analysis | Study as unit of analysis

Figure 210 | Basic analysis | Study as unit of analysis

In Figure 210 the variance for each study is 0.05, the variance for the combined effect [A] is 0.01, and the standard error [B] is 0.10. These are the same numbers we saw in Figure 206.

Again, we can proceed to meta-regression and run the analysis with only the intercept.

Figure 211 | Regression | Study as unit of analysis

• In Figure 211 the number of studies in the analysis [C] is 5.
• The standard error [D] is 0.10 (which implies a variance of 0.01).

These are the same numbers we saw in Figure 210.
Finally, we can navigate to More Results > All Data. Figure 212 shows that we now have five studies rather than 10 subgroups, and the variance for each study is 0.05.

Figure 212 | Regression | Study as unit of analysis

The point is that when we have independent subgroups within studies, every subgroup yields independent information, and must be treated as such in the analysis. We may elect to use subgroup as the unit of analysis or we may elect to use study as the unit of analysis, but in either case the within-study variance for the combined effect should be approximately the same. It is in the main analysis, and it is in the regression.

If the two options yield the same result here, does it matter which one we use? Yes, it does. In this example we created a homogeneous set of studies for illustrative purposes, and tau-squared was zero. When tau-squared is zero, the two approaches yield very similar (if not identical) results. By contrast, in a real analysis the estimate of tau-squared will often differ depending on whether it is based on 10 subgroups or on 5 studies. Concretely, if the effect sizes tend to vary a lot from one study to the next, but to be relatively similar for the subgroups within a study, then tau-squared based on studies will tend to be larger while tau-squared based on subgroups will tend to be smaller.

The decision to use one or the other depends on how you see the sampling frame, and the population to which you want to generalize. If you want to get a sense of how the effects are distributed across studies, then it makes sense to use study as the unit of analysis. If you want to get a sense of how the effects are distributed across subgroups, then it makes sense to use subgroups as the unit of analysis. As noted above, the precision for estimating the mean effect will likely be similar in the two cases. However, the estimate of the variance itself (and statistics that depend on this estimate) will differ.
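The variance arithmetic in the running example can be verified with a short fixed-effect sketch. This is an illustrative computation only, not code from the CMA program; the function name is ours.

```python
# Fixed-effect arithmetic for the running example: 5 studies x 2 independent
# subgroups, each subgroup with within-study variance 0.10.
# Illustrative sketch only -- not code from the CMA program.

def fixed_effect_variance(variances):
    """Variance of the inverse-variance weighted mean."""
    return 1.0 / sum(1.0 / v for v in variances)

# (a) Subgroup as the unit of analysis: 10 units, each with variance 0.10
v_subgroup = fixed_effect_variance([0.10] * 10)      # 0.10/10 = 0.01

# (b) Study as the unit of analysis: first merge the two subgroups in each
# study (variance 0.10/2 = 0.05), then combine the 5 study-level estimates
v_study_level = fixed_effect_variance([0.10, 0.10])  # 0.05 per study
v_study = fixed_effect_variance([v_study_level] * 5) # 0.05/5 = 0.01

print(v_subgroup, v_study)  # both 0.01, i.e. a standard error of 0.10
```

Either route ends at the same combined variance (0.01), which is the point made above: with independent subgroups, the precision of the mean effect does not depend on the choice of analysis unit.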
This applies to both the main analysis and also to the regression. For regression, there is an additional consideration, as follows. Suppose that the subgroups are Male and Female. If you intend to use gender as a covariate, then the only option would be to use subgroup as the unit of analysis. Even if you don’t intend to use gender as a covariate, using subgroup as the unit of analysis may allow you to work with a finer level of data. For example, suppose you have the mean age for each subgroup, and plan to use mean age as a covariate. If you use subgroup as the unit of analysis you can use the mean age for each subgroup. By contrast, if you use study as the unit of analysis you’ll need to use the mean age for the study, and any difference in age between subgroups will be lost.

MULTIPLE OUTCOMES OR TIME-POINTS

Above, we addressed the case where we have independent subgroups within studies. The key was that each subgroup represented independent information (a person was included in one subgroup or another, but not both) and was treated as such in the analysis. Now, we turn to the case where we have multiple outcomes or time-points within studies. The key here is that the rows for each study are based (at least partly) on the same persons, and do not provide independent information. We’ll use the example of multiple outcomes and then comment on the issue of multiple time-points below.

To highlight the difference between independent subgroups on the one hand, and multiple outcomes on the other, we’ll use the same data as before. However, this time we’ll identify the rows within each study as being for two outcomes (reading and math) rather than for two subgroups (male and female). When creating the data file we use Insert > Column for > Outcome names (Figure 213).

Figure 213 | Data-entry | Multiple outcomes

The data are shown in Figure 214.
Figure 214 | Data-entry | Multiple outcomes

Proceed to the main analysis screen. The program initially shows an analysis for Math only (Figure 215).

Figure 215 | Basic analysis | Multiple outcomes | Select one outcome

Right-click on Outcome to see the following options (Figure 216)
• Use the mean of the selected outcomes
• Use all of the selected outcomes, assuming independence
• Use the first outcome, based on this sequence

Figure 216 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence

Here, we select “Use all of the selected outcomes, assuming independence”. We present this option here to illustrate its impact, and not to suggest that it is generally a valid option. In fact, we would consider using this option only when (a) there is only minor overlap in the samples and/or (b) the correlation between outcomes is small. Otherwise, this option will underestimate the variance (overestimate the precision) of the summary effect size.

By selecting this option we are treating the correlation between outcomes as zero, which is (in effect) what we did with two independent subgroups. It follows that the results should be the same as they had been before. In fact, in Figure 217 the combined effect has a variance [A] of 0.01 and a standard error [B] of 0.10, which are the same numbers we saw in Figure 206.

Figure 217 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence

As before, we’ll run a meta-regression with only the intercept (Figure 218), to show that the regression will yield the same results as the traditional analysis.

Figure 218 | Multiple outcomes | Setup

The results (Figure 219) show a standard error [C] of 0.1000, with an implied variance of 0.0100, the same numbers we saw in Figure 217.
Figure 219 | Multiple outcomes | Use all outcomes, assuming independence

Finally, More results > All data displays the actual rows of data in the analysis (Figure 220), which are identical to those in Figure 217.

Figure 220 | Multiple outcomes | Use all outcomes, assuming independence

The point of this exercise was to show that when we treat the outcomes as independent of each other, the impact is the same as when we are working with independent subgroups. The program treats every line of data as though it provides new (unique) information. While this is usually appropriate for independent subgroups, it is rarely appropriate for multiple outcomes. We will usually want to treat the outcomes as dependent, and compute the variance accordingly, as shown here. Back on the main analysis screen (Figure 221) select “Use the mean of the selected outcomes” [A]

Figure 221 | Basic analysis | Multiple outcomes | Use mean of outcomes

In Figure 222 the program combines the data for math and reading to yield a “Combined” score for each study [B].

Figure 222 | Basic analysis | Multiple outcomes | Use mean of outcomes

In computing the variance for each study the program assumes a correlation of 1.0 between reading and math. The variance for reading is 0.10, the variance for math is 0.10, and the variance for the study-level composite is also 0.10. This follows from our decision to treat the correlation between reading and math as 1.0, which means that the second outcome provides no new information (and so has no impact on the variance). This approach is conservative, in the sense that the true correlation is usually less than 1; the second outcome probably provides some new information; and the true variance is probably lower than the value we are using. As always, once we have a variance for each study we can compute the variance for the combined effect size. In Figure 222 the variance of the combined effect [C] is 0.10/5, or 0.02.
This is twice as large as the value in Figure 217, which treated the outcomes as independent of each other. Similarly, the standard error of the combined effect [D], computed as the square root of the variance, is now 0.1414, as compared with 0.1000 in Figure 217. Now, we proceed to meta-regression.

The results for the regression (Figure 223) are identical to those for the traditional analysis (Figure 222). In particular, the standard error [E] is 0.1414, which implies a variance of 0.0200.

Figure 223 | Multiple outcomes | Use mean of outcomes

Figure 224 | Multiple outcomes | Use mean of outcomes

Finally, we can click More results > All studies to see the data rows that are being transmitted to the regression (Figure 224). There are five rows, and the variance for each is 0.10, precisely as we had seen in Figure 222.

Other options

Our goal here was to show how the transformations and filtering specified in the main analysis are carried over to the regression. For this purpose we outlined two options: to treat the outcomes as independent, or to combine them into a composite score assuming a correlation of 1.0. The program includes a number of other options for working with multiple outcomes. These are not discussed here, but could be used in the main analysis screen and would carry over to the regression. They include selecting one outcome in preference to the others, and taking the mean of some outcomes while excluding others. While the program always assumes a correlation of 1 when it automatically generates composite scores, you can create composite scores manually using any correlation.

How covariates are merged

To this point we’ve shown what happens with the effect size and variance when we are working with multiple subgroups or outcomes within a study. Now, we need to consider what happens with the covariates. For example, suppose the subgroups are male and female, and each has a value for “Age”.
If we merge the data from the two subgroups, what do we use for “Age”?

Independent subgroups within a study

The two options are
• To use subgroup as the unit of analysis
• To use study as the unit of analysis

If we use subgroup as the unit of analysis then each subgroup has its own value for each covariate, and this value gets transmitted along with the effect size. However, if we use study as the unit of analysis then we need one value for the study, and we need to consider how this value is created. The rules are as follows.
• If both subgroups have the identical value (for example, age is coded as 40) then this same value is assigned to the study.
• If the value differs from one subgroup to the next within a study (one subgroup has age 40 and another has age 50) the covariate is transmitted as missing, and the study will be excluded from any regression that uses this covariate (see the section on missing data). You can use the “All studies” table to see where data are missing. If appropriate, you can return to the data-entry screen and modify the data (perhaps use age 45 for both).

The same idea is applied to multiple outcomes.
• If we use all outcomes then each outcome has its own value for each covariate, and this value gets transmitted along with the effect size.
• If we use the mean of outcomes then we need one value for the study. If both outcomes have the identical value (for example, age is coded as 40) then this same value is assigned to the study. If the value differs (one outcome has age 40 and another has age 50) the covariate is transmitted as missing, and the study will be excluded from any regression that uses this covariate (see the section on missing data). You can use the “All studies” table to see where data are missing. If appropriate, you can return to the data-entry screen and modify the data (perhaps use age 45 for both).

Multiple time-points

The case of multiple time-points works the same as multiple outcomes.
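The composite-score variance used in the multiple-outcomes example above follows from the standard formula for the variance of the mean of two correlated estimates. A short sketch (illustrative only; the function name is ours, and the correlation r is a value you assume, not something the program estimates from the data):

```python
# Variance of the mean of two outcomes with variances v1, v2 and an assumed
# correlation r between them: Var((y1 + y2)/2) = (v1 + v2 + 2*r*sqrt(v1*v2))/4.
# Illustrative sketch of the formula, not code from the CMA program.
import math

def composite_variance(v1, v2, r):
    """Variance of (y1 + y2)/2 when cov(y1, y2) = r*sqrt(v1*v2)."""
    return (v1 + v2 + 2.0 * r * math.sqrt(v1 * v2)) / 4.0

v = 0.10  # within-study variance of each outcome in the running example

# r = 1.0 (the program's automatic composite): the second outcome adds nothing
print(composite_variance(v, v, 1.0))  # 0.1, as in Figure 222

# r = 0.0 (treat the outcomes as independent): half the variance
print(composite_variance(v, v, 0.0))  # 0.05, as in the "independence" option
```

With r = 1 the study-level variance stays at 0.10 (the conservative choice described above); with r = 0 it drops to 0.05, mirroring the independent-subgroups case.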
MULTIPLE COMPARISONS

Here, we use the term “multiple comparisons” to refer to a study where the same control group serves as the comparator for a series of drugs. If you will look at one drug at a time, then the control-group data appears only once in each analysis, in which case one could argue that each analysis is valid on its own. However, if you will run an analysis that compares all treatments vs. the control, then we need to address the fact that the same control group will appear more than once in the data. There are several options.
• One option is to use the same approach discussed for multiple outcomes.
• Another option is to divide the control group into segments when entering the data, and assign each segment to one treated group. For example, if the control group has 20 events over 100 people, treat it as being two control groups with 10 events and 50 people in each. This option is explored in “Introduction to Meta-Analysis”.

PART 19: SOME CAVEATS

STATISTICAL POWER FOR META-REGRESSION

Statistical power is the likelihood that a test of significance will reject the null hypothesis. In the case of meta-regression it is the likelihood that the Z-test of a single covariate or the Q-test of a set of covariates will yield a statistically significant p-value. Power depends primarily on two factors,
• The magnitude of the effect
• The precision with which we measure the effect

In a fixed-effect analysis precision is driven primarily by the total number of individual subjects across all studies. In a random-effects analysis precision is driven by the total number of individual subjects across all studies, and also by the variance in treatment effects and the total number of studies (Hedges and Pigott, 2004). While there is a general perception that power for testing the main effect is consistently high in meta-analysis, this perception is not correct, and certainly does not extend to tests of effects in meta-regression.
(Hedges and Pigott, 2001). The failure to find a statistically significant p-value in meta-regression could mean that the effect (if any) is quite small, but could also mean that the analysis had poor power to detect even a large effect. One should never use a non-significant finding to conclude that a covariate (or a set of covariates) is not related to effect size.

MULTIPLE COMPARISONS

In primary studies researchers often need to address the issue of multiple comparisons. The basic problem is that if we conduct a series of analyses with alpha set at 0.05 for each, then the overall likelihood of a type I error (assuming that the null is actually true) will exceed 5%. In the context of regression, this could be a problem if we test a series of covariates.

Some suggest simply allowing for a 5% error rate for each covariate. To use this approach we can simply work with the p-values on the main results screen. In this case it’s a good idea to evaluate the data in context. For example, one significant p-value in forty tests would be suspect.

Some suggest conducting an omnibus test that asks if there are any non-zero effects, and then proceeding to look at pair-wise comparisons only if the initial test meets the criterion for significance. To implement this approach we could first look at the p-value for the model. Or, if some of the covariates are nuisance variables (potential confounds that we want to hold constant) while others are of interest (say, a set of covariates that represent the treatment condition) we could test the increment for this set and use this as the gateway p-value.

Others suggest going straight to the individual covariates but using a stricter criterion for significance (for example, a criterion alpha of 0.01 rather than 0.05 for five tests). Others (e.g., Rothman 1990) suggest that there are many cases where we can safely ignore the problem of multiple comparisons.
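The stricter-criterion approach mentioned above (alpha of 0.01 rather than 0.05 for five tests) is the Bonferroni adjustment. A minimal sketch, with made-up p-values for five covariate tests:

```python
# Bonferroni-style adjustment for testing several covariates.
# Illustrative sketch; the p-values below are invented, not from any analysis.

def bonferroni_alpha(alpha, n_tests):
    """Per-test criterion that keeps the family-wise error rate near alpha."""
    return alpha / n_tests

p_values = [0.030, 0.004, 0.200, 0.015, 0.048]    # five covariate tests
criterion = bonferroni_alpha(0.05, len(p_values))  # 0.01 for five tests
significant = [p for p in p_values if p < criterion]
print(significant)  # [0.004]
```

Note that at the unadjusted 0.05 level four of these five p-values would be "significant"; under the stricter criterion only one survives, which is exactly the trade-off the approaches above are negotiating.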
The same issues exist in meta-analysis, and the approaches outlined above for primary studies can be applied to meta-analysis as well (e.g., Hedges and Olkin, 1985).

PART 20: TECHNICAL APPENDIX

APPENDIX 1: THE DATASET

The motivating example in this book is the BCG dataset.
• The Excel™ version is called BCG.xls
• The CMA version is called BCG.cma

See Part 1: Data files and downloads for the location of these files.

The original dataset includes only a few moderators (Latitude, Year, Allocation) as shown in Figure 225. We needed to create some additional moderators for purposes of this book. Some of these are shown in Figure 226 and all of them are listed in Table 10. Table 10 also shows how we can create the new moderators using Excel. If we start with the dataset in CMA, it is possible to create new moderators directly in CMA, but the process can become tedious. An easier approach is to copy the relevant block of data from CMA to Excel™, add the new columns in Excel™, and then copy these new columns back into CMA. Then, in CMA, we just need to assign a name to each new column, identify it as a moderator, and assign a type (Categorical, Integer, or Decimal). Table 10 shows the formulas that were employed in Excel™ to assign values to each study for the new variables.

Figure 225 | BCG Data in Excel™

Figure 226 | BCG Data in Excel™

Table 10

Column | Variable | Type | Formula for Row 4

The following are the original variables
J | Latitude | Integer
K | Year | Integer
L | Allocation | Categorical

Variables related to Latitude. We compute the mean Latitude, which is 33.46154, in cell J18. LatitudeC is Latitude minus the mean. LatitudeC2 is the square of LatitudeC. Climate is a categorical variable, coded either Hot (latitude < 40) or Cold. Hot and Cold are dummy variables based on Climate.
M | LatitudeC | Integer | =J4-$J$18
N | LatitudeC2 | Integer | =M4^2
O | Climate | Categorical | =IF(J4<40,"Hot","Cold")
P | Hot | Integer | =IF($O4="Hot",1,0)
Q | Cold | Integer | =IF($O4="Cold",1,0)

Variables related to Year. We compute the mean Year, which is 1948.0769, in cell K18. YearC is Year minus the mean. Time is a categorical variable, coded either Early (Year < 1945) or Recent. Early and Recent are dummy variables based on Time.

R | YearC | Decimal | =K4-$K$18
S | Time | Categorical | =IF(K4<1945,"Early","Recent")
T | Early | Integer | =IF($S4="Early",1,0)
U | Recent | Integer | =IF($S4="Recent",1,0)

Interactions. Hot x YearC is the interaction of Hot by Year (centered). Hot x Recent is the interaction of Hot by Recent. LatitudeC x YearC is the interaction of Latitude (centered) by Year (centered).

V | Hot x YearC | Decimal | =P4*R4
W | Hot x Recent | Integer | =P4*U4
X | LatitudeC x YearC | Decimal | =M4*R4

The “IF” formulas in Excel™ take the source column and recode it. Alternatively, you can simply type in the new values. Normally, you do not need to create dummy variables such as Hot and Cold. Rather, you can enter categorical variables into the regression and the program will create the dummy variables on the fly. You do need to create the dummy variables (as shown here) if you will be running the regression without the intercept, or if you plan to use them in interactions.

In Figure 226, row 2 indicates whether each variable is categorical, integer, or decimal. In CMA, when you define a variable as a moderator, you must also specify that it is one of these types.

A note on Categorical and Dummy variables

A categorical variable is a variable where the codes represent distinct groups, and it makes sense to say that the groups are different from each other, but not that one is “more than” another. A common example is Gender. Usually, each study is assigned a text value (Male or Female) for the variable. A categorical variable cannot be used directly in the regression.
Rather, we create dummy variables that contain the same information as the categorical variables, and use these in the analysis. In most cases, the program is able to create the dummy variables on the fly. Concretely, if you enter a categorical variable into the analysis, the program will create one (or more) dummy variables, assign a code for each study, and enter these dummy variables into the analysis. When this approach can be used, it offers the advantage of speed and simplicity. However, this approach can only be used in simple regressions. It cannot be used if
• We want to omit the intercept from the prediction model
• We want to use effects-coding, contrast-coding, or some other scheme
• We want to create interaction terms that include the categorical variable

In any of these cases, we need to create the dummy codes manually and enter the data as we would for any other variable. Note that the dummy variables must be identified as integer or decimal, and not as categorical. For example,
• Climate is categorical (the values are Cold and Hot). However, the variable called Cold is an integer variable (the values are 0 and 1) and the variable called Hot is an integer variable (the values are 0 and 1).
• Time is categorical (the values are Early and Recent). However, the variable called Early is an integer variable (the values are 0 and 1) and the variable called Recent is an integer variable (the values are 0 and 1).

APPENDIX 2: UNDERSTANDING Q

Introduction

In a simple meta-analysis we compute Q, a number that reflects the dispersion of all effect sizes about the mean effect size. Then we use Q to compute an array of indices, such as T, T2, and I2, that address specific aspects of dispersion. We briefly review the computation of Q and its relationship to these indices.
Then we proceed to the primary focus of this appendix: to show that the computation of Q and the other indices is fundamentally the same for a simple analysis (with one group of studies), for a subgroups analysis, and for a regression. In all three cases we compute Q by working with the deviation of the effect sizes from the predicted effect size. In a simple analysis the predicted effect size is the mean of all studies, in a subgroups analysis it is the mean of the studies within the subgroup, and in a regression it is the relevant point on the regression line.

Background

Q is a measure of dispersion on a standardized scale. We use Q to address the question “Do all studies share a common effect size, Y?” To understand how Q works, consider the following. Suppose that we have a single study and want to ask, “Is the true effect size for this study equal to Y?” We could compute a Z-value for the study using

Z = (X − Y) / SE_X,   (1.34)

where X is the observed effect size, Y is the predicted effect size, and SE_X is the standard error of the effect size for this study. To test the hypothesis that the true effect for the study is Y, and that the observed deviation (X − Y) is due to sampling error, we could compare the observed Z to the Z distribution. If the observed Z exceeds 1.96 (or −1.96) we might conclude that the true effect size is probably not Y.

Alternatively, we could have performed the same analysis by working with Z2. The square of Z is called Q, and the observed Q value is evaluated with reference to the chi-squared distribution with 1 df. If the observed Q exceeds 3.84 (that is, 1.96 squared) we might conclude that the true effect size is probably not Y.

In this example we wanted to test the hypothesis that the true value for one study is Y. The same logic can be extended to multiple studies, and this is exactly what happens when we test the assumption of homogeneity.
For example, suppose that the meta-analysis includes five studies, and we want to test the assumption of homogeneity, namely that the true effect size for each of the five studies is Y. We compute Z² for each study and sum these values over all studies to yield Q. We then evaluate Q with reference to the chi-squared distribution with df equal to the number of studies minus one (here, df = 4).

We've elected to present Q in this way in order to highlight its relationship to Z. Researchers understand that Z is on a standardized scale, and that, if all studies share a common true effect size, then observed effect sizes with deviations of 1.96 or 2.58 on this scale occur in 5% or 1% of cases, respectively. From this perspective, it's easy to understand how Q can serve the same function.

In most texts (including this one) Q is also described as a "weighted sum of squares". This description highlights the fact that the deviations are weighted, and emphasizes that the deviation of X from Y (the observed effect from the predicted effect) gets more weight in a more precise study (typically a larger study) than it does in a less precise (smaller) study. To compute a weighted sum of squares we would compute each deviation and then weight it by the inverse variance. Thus, for any single study

Q = (X − Y)² × (1 / V).   (1.35)

That is, we start with the squared deviation and then weight that value by the inverse of the within-study variance. This equation (1.35) highlights the fact that studies with lower variance (typically the larger studies) are assigned greater weight. The earlier equation (1.34) highlighted the fact that the Q-values are on a standardized scale (squared). In fact, the two formulas are algebraically equivalent. To see this, note that (1.35) is in squared values. If we take the square root of both sides of the equation we get

Z = (X − Y) × (1 / SE_X)   (1.36)

or

Z = (X − Y) / SE_X,   (1.37)

which is identical to (1.34).
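The equivalence of (1.34) and (1.35) is easy to confirm numerically. Here is a minimal sketch with five hypothetical studies:

```python
import math

effects = [-0.2, -0.5, -0.9, -0.4, -0.7]     # observed effect sizes X (made up)
variances = [0.04, 0.09, 0.05, 0.10, 0.06]   # within-study variances V (made up)
Y = -0.55                                    # predicted (mean) effect

# Equation (1.35): weight each squared deviation by the inverse variance.
Q_wss = sum((x - Y) ** 2 / v for x, v in zip(effects, variances))

# Equation (1.34): sum the squared Z-values instead.
Q_z = sum(((x - Y) / math.sqrt(v)) ** 2 for x, v in zip(effects, variances))

print(abs(Q_wss - Q_z) < 1e-9)   # True: the two formulas are equivalent
```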
In either case (whether we use (1.34) or (1.35)) we sum the Q values over all studies. Hence the term "WSS", or weighted sum of squares.

Once we have computed Q, we can use it for a number of purposes.

First, we can test the Q value for statistical significance. If we are working with the fixed-effect model, the model requires that all studies share a common effect size. If the Q-value is statistically significant, we conclude that the studies do not share a common effect size, and this assumption has been violated. We probably should not be using the fixed-effect model.

Figure 227 | Flowchart showing how T² and I² are derived from Q

Second, we can use Q to estimate the amount of between-study variance and related indices. Since Q follows the chi-squared distribution, the expected value of Q is equal to df. It follows that if the observed value of Q exceeds the df, the excess can be attributed to between-study variance. This value, (Q − df), is the basis for the following.

Variance and standard deviation of true effects

To compute T², the between-studies variance, we start with (Q − df). We divide this difference by a factor (called C) that puts it into the metric of the effect size, squared. If T² is the variance of effect sizes between studies, then its square root, T, is the standard deviation of effect sizes between studies.

Ratio of true variance to total variance

To compute I², the ratio of true to total variance, we start with (Q − df). If Q reflects the total WSS and (Q − df) reflects the WSS between studies, then the ratio (Q − df)/Q is by definition the ratio of true to total variance, called I². By convention we multiply I² by 100 and express it as a percentage (0% to 100%).

Proportion of true variance explained by the predictors

In primary studies, R² is the proportion of variance explained by predictors.
In meta-regression the analog to R² is the proportion of true variance explained by predictors, computed as the explained variance as a proportion of the original variance.

Important

It is important to note that Q is computed in exactly the same way (using weights based solely on within-study variance) regardless of whether we are working with a fixed-effect or a random-effects model. The only difference between the models is in how we interpret Q and the associated statistics.

• Under the fixed-effect model, Q addresses the statistical model. If Q is statistically significant, the statistical model is not valid.
• Under the random-effects model, Q addresses the prediction model. If Q is statistically significant, the prediction model is incomplete, meaning that some of the true variance in effects is unexplained.

All of these statistics are based on Q, which is based on the deviation of observed from predicted effects. Below, we show how this works for the three general cases we called (A) one group of studies, (B) two subgroups of studies, and (C) a regression. The key point we want to emphasize with these examples is that these cases are all fundamentally the same.

Case A

The computation of Q and derivative statistics for the simple case (the overall mean) is shown here. First, we run the regression using fixed-effect weights (Figure 228).

Figure 228 | Case-A | Main results | Fixed-effect weights

This yields the prediction equation

Y = −0.4303.   (1.38)

Figure 229 | Case-A | Computing Q

In Figure 229 we use (1.38) to compute the weighted squared deviation for each study and sum these to yield Q, which is 152.2329 (allowing for rounding error). Working with Q we then compute T², the variance of true effect sizes in the sample.
It is given by

T² = (Q − df) / C = (152.2329 − 12) / 454.1808 = 0.3088.   (1.39)

In this equation df is the number of studies minus the number of predictors (the intercept), and C is a conversion factor that allows us to move from the standardized scale of Q to the metric of the effect size. This estimate of T² is then employed to assign random-effects weights to all studies and re-run the regression. The result is shown in Figure 230, where the prediction equation is

Y = −0.7141.   (1.40)

Figure 230 | Case-A | Main results | Random-effects

The standard deviation of true effect sizes, T, is given by

T = √T² = √0.3088 = 0.5557.   (1.41)

If we assume that the true effects are normally distributed about their predicted value (and that the predicted value is correct), then 95% of all studies would have true effects in the range given by the predicted value plus or minus 1.96 × T. Under the random-effects model the predicted effect for all studies is −0.7141, and so the true effects would fall in the range of

LL = −0.7141 − 1.96 × 0.5557 = −1.8032   (1.42)
UL = −0.7141 + 1.96 × 0.5557 = 0.3750.   (1.43)

These are the values we use to create the graphic. The normal curve is centered at −0.7141 and extends roughly from −1.8032 to 0.3750, intended to reflect the true effect size in some 95% of relevant studies. (This simplified example assumes that the mean is known. To actually compute prediction intervals we would take account of the error in estimating the mean.)

Figure 231 | Case-A | Dispersion of effects about regression line

We can also compute I², the ratio of true variance to total variance. This is given by

I² = (Q − df) / Q = (152.33 − 12) / 152.33 = 92.12%.   (1.44)

Typically we can also compute R², the proportion of variance explained by the covariates. In this case there are no covariates, and so this is not applicable.

Case B

When we turn to Case B (two subgroups) all of the formulas remain the same.
The only thing that changes is that the predicted value is now different for studies in each subgroup. (Also, the conversion factor C is based on a different formula.) First, we run the regression using fixed-effect weights (Figure 232).

Figure 232 | Case-B | Main results | Fixed-effect weights

This yields the prediction equation

Y = −0.9986 + 0.8870 × Hot,   (1.45)

where Hot is 0 for cold climates and 1 for hot climates.

Figure 233 | Case-B | Computing Q

In Figure 233 we use (1.45) to compute the weighted squared deviation for each study and sum these to yield Q, which is 41.7894 (allowing for rounding error). Working with Q we then compute T², the variance of true effect sizes in the sample. It is given by

T² = (Q − df) / C = (41.7894 − 11) / 319.4579 = 0.0964.   (1.46)

In this equation df is the number of studies minus the number of predictors (the intercept and climate), and C is a conversion factor that allows us to move from the standardized scale of Q to the metric of the effect size. This estimate of T² is then employed to assign random-effects weights to all studies and re-run the regression.

Figure 234 | Case-B | Main results | Random-effects

The result is shown in Figure 234, where the prediction equation for cold studies is

Y = −1.1987 + 0 × 0.9203 = −1.1987   (1.47)

and for hot studies is

Y = −1.1987 + 1 × 0.9203 = −0.1116.   (1.48)

The standard deviation of true effect sizes, T, is given by

T = √T² = √0.0964 = 0.3105.   (1.49)

If we assume that the true effects are normally distributed about their predicted value (and that the predicted value is correct), then 95% of all studies would have true effects in the range given by the predicted value plus or minus 1.96 × T.
For cold climates, 95% of true effects would fall in the range of

LL = −0.9986 − 1.96 × 0.3105 = −1.6071   (1.50)
UL = −0.9986 + 1.96 × 0.3105 = −0.3901.   (1.51)

For hot climates, 95% of true effects would fall in the range of

LL = −0.1116 − 1.96 × 0.3105 = −0.7201   (1.52)
UL = −0.1116 + 1.96 × 0.3105 = +0.4969.   (1.53)

These are the values we use to create the graphic shown in Figure 235.

Figure 235 | Case-B | Dispersion of effects about regression line

For cold studies the normal curve is centered at −0.9986 and extends roughly from −1.6071 to −0.3901, intended to reflect the true effect size in some 95% of relevant studies. For hot studies the normal curve is centered at −0.1116 and extends roughly from −0.7201 to +0.4969, intended to reflect the true effect size in some 95% of relevant studies. (This simplified example assumes that the mean is known. To actually compute prediction intervals we would take account of the error in estimating the mean.)

We can also compute I², the ratio of true variance to total variance. This is given by

I² = (Q − df) / Q × 100 = (41.7894 − 11) / 41.7894 × 100 = 73.68%.   (1.54)

We can also compute R², the proportion of variance explained by the covariates. The unexplained variance with only the intercept is 0.3088 and the unexplained variance with the intercept plus climate is 0.0964, which means that the variance explained by climate is

T²Explained = 0.3088 − 0.0964 = 0.2124.   (1.55)

Then, R² is the explained variance as a proportion of the total variance, or

R² = T²Explained / T²Total = 0.2124 / 0.3088 = 0.6878.   (1.56)

Case C

Finally, we can apply exactly the same approach to compute Q for the case where we use latitude to predict the effect size. First, we run the regression using fixed-effect weights (Figure 236).

Figure 236 | Case-C | Main results | Fixed-effect weights

This yields the prediction equation

Y = 0.3436 − 0.0292 × Latitude,   (1.57)

where Latitude is an absolute value.
Figure 237 | Case-C | Computing Q

In Figure 237 we use (1.57) to compute the weighted squared deviation for each study and sum these to yield Q, which is 30.7339 (allowing for rounding error). Working with Q we then compute T², the variance of true effect sizes in the sample. It is given by

T² = (Q − df) / C = (30.7339 − 11) / 311.7368 = 0.0633.   (1.58)

In this equation df is the number of studies minus the number of predictors (the intercept and latitude), and C is a conversion factor that allows us to move from the standardized scale of Q to the metric of the effect size. This estimate of T² is then employed to assign random-effects weights to all studies and re-run the regression.

Figure 238 | Case-C | Main results | Random-effects

The result is shown in Figure 238, where the prediction equation for all studies is

Y = 0.2595 − 0.0292 × Latitude.   (1.59)

The standard deviation of true effect sizes, T, is given by

T = √T² = √0.0633 = 0.2516.   (1.60)

If we assume that the true effects are normally distributed about their predicted value (and that the predicted value is correct), then 95% of all studies would have true effects in the range given by the predicted value plus or minus 1.96 × T.

Figure 239 | Case-C | Dispersion of effects about regression line

In the figure we show the range of effects for three arbitrary points on the regression line. For studies at a latitude of 20, the predicted effect size is

Y = 0.2595 − 0.0292 × 20 = −0.2404.   (1.61)

Then, 95% of true effects would fall in the range of

LL = −0.2404 − 1.96 × 0.2516 = −0.7335   (1.62)
UL = −0.2404 + 1.96 × 0.2516 = 0.2527.   (1.63)

Thus, for studies at a latitude of 20 the normal curve is centered at −0.2404 and extends roughly from −0.7335 to +0.2527, intended to reflect the true effect size in some 95% of relevant studies. (This simplified example assumes that the mean is known. To actually compute prediction intervals we would take account of the error in estimating the mean.)
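The T² and R² arithmetic for the three cases follows one pattern, and can be checked in a few lines of Python using the Q, df, and C values reported in this appendix:

```python
import math

# Q, df, and C for the three cases, as reported in the text
# (equations 1.39, 1.46, and 1.58).
cases = {
    "A": (152.2329, 12, 454.1808),   # intercept only
    "B": (41.7894, 11, 319.4579),    # intercept plus climate
    "C": (30.7339, 11, 311.7368),    # intercept plus latitude
}

T2 = {k: (Q - df) / C for k, (Q, df, C) in cases.items()}   # between-study variance
T = {k: math.sqrt(v) for k, v in T2.items()}                # its standard deviation

# R-squared: explained true variance as a proportion of the Case-A total.
R2_climate = (T2["A"] - T2["B"]) / T2["A"]    # about 0.6878
R2_latitude = (T2["A"] - T2["C"]) / T2["A"]   # about 0.7950

print(round(T2["A"], 4), round(T2["B"], 4), round(T2["C"], 4))   # 0.3088 0.0964 0.0633
```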
We can also compute I², the ratio of true variance to total variance. This is given by

I² = (Q − df) / Q × 100 = (30.7339 − 11) / 30.7339 × 100 = 64.21%.   (1.64)

We can also compute R², the proportion of variance explained by the covariates. The unexplained variance with only the intercept is 0.3088 and the unexplained variance with the intercept plus latitude is 0.0633, which means that the variance explained by latitude is

T²Explained = 0.3088 − 0.0633 = 0.2455.   (1.65)

Then, R² is the explained variance as a proportion of the total variance, or

R² = T²Explained / T²Total = 0.2455 / 0.3088 = 0.7950.   (1.66)

APPENDIX 3: TESTS OF HETEROGENEITY

In any analysis, whether based on the fixed-effect model or the random-effects model, tests of heterogeneity are based on fixed-effect weights. Our goal in this section is to explain why. Consider Figure 240, which presents results for a simple analysis, with one set of studies.

Figure 240 | Heterogeneity statistics in basic analysis

At the left the program shows the analyses that address the effect size. One line presents results based on fixed-effect weights and the other presents results based on random-effects weights. Since the study weights on the first row are based on V while those on the second row are based on V + T², the effect size, the standard error, and all other statistics differ from one line to the next.

In the section on heterogeneity one might similarly expect to see two sets of statistics, one based on fixed-effect weights and the other on random-effects weights. In fact, there is only one set of statistics, based on fixed-effect weights, that applies to both statistical models. The reason has to do with the nature of the null hypothesis for heterogeneity, and the Q statistic that we use to test this hypothesis. Recall that Q is the WSS, or the weighted sum of squared deviations. That is,
1. We take the deviation of every effect size from the mean effect size
2. We square that deviation
3. We weight the squared deviation
4. We sum this value across all studies

Under the fixed-effect model, in step 1 we compute the deviation of every effect size from −0.4303, and in step 3 we weight each squared deviation by 1/V. Computed in this way, Q is 152.2330. With 12 degrees of freedom the corresponding p-value is < 0.0001.

Similarly, under the random-effects model, in step 1 we compute the deviation of every effect size from −0.4303, and in step 3 we weight each squared deviation by 1/V. Computed in this way, Q is 152.2330. With 12 degrees of freedom the corresponding p-value is < 0.0001.

Thus, the test is identical under the two models because under the null there is no variation in true effects, which means that T² is zero. It follows that

1. In step 1, when we compute the deviation of each effect size from the mean effect size, the mean effect size we need to use is −0.4303 (the fixed-effect estimate) rather than −0.7141 (the random-effects estimate). If T² is zero, the weighted mean would be −0.4303.
2. In step 3, when we weight each squared deviation to get the weighted sum of squares (Q), the weights must be based on V. Again, if T² is zero then it has no impact on the weights.

Thus, the expected variation in effects under the null (that T² is zero) is identical whether we are using the fixed-effect model or the random-effects model. Therefore, the test of this null is identical under both models.

If it still seems odd that we're using fixed-effect weights to test the null for a random-effects model, this might help. Rather than thinking of the test as using fixed-effect weights based on V, think of it as using random-effects weights based on V + T², where (under the null) T² happens to be zero.

Of course, the same logic holds if we use a regression approach.
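The reduction at the heart of this argument, that random-effects weights collapse to fixed-effect weights when T² is zero, can be sketched as follows (the variances here are hypothetical):

```python
# Under the null hypothesis T2 = 0, random-effects weights 1/(V + T2)
# reduce to fixed-effect weights 1/V, so the same Q is obtained
# under either model.
variances = [0.04, 0.09, 0.05, 0.10]   # hypothetical within-study variances V
T2_null = 0.0                          # T2 under the null hypothesis

fe_weights = [1 / v for v in variances]              # fixed-effect weights
re_weights = [1 / (v + T2_null) for v in variances]  # random-effects weights under the null

print(fe_weights == re_weights)   # True: the two sets of weights coincide
```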
Figure 241 | Heterogeneity statistics in regression

Figure 242 | Heterogeneity statistics in regression

Figure 241 and Figure 242 show results for a fixed-effect analysis and a random-effects analysis using the same data as Figure 240.

• In the fixed-effect analysis the residual Q is 152.2330 with df = 12 and p < 0.0001.
• In the random-effects analysis the Goodness of fit test shows Q is 152.2330 with df = 12 and p < 0.0001.

These numbers are identical to each other. While the test for heterogeneity is identical under the FE and the RE models, the way that we use the test results depends on the model.

• Under the FE model a significant p-value for heterogeneity tells us that the data are not consistent with the model. There is no reason to compute T² or I², since these are assumed to be zero. For this reason, regardless of whether or not the p-value is statistically significant, Figure 241 shows no estimate of T² nor of I².
• Under the RE model a significant p-value tells us that there is empirical evidence of heterogeneity, while a non-significant p-value tells us that this evidence is not present. Whether or not the p-value is statistically significant, we proceed to compute T², which is incorporated into the weighting scheme. For this reason, regardless of whether or not the p-value is statistically significant, in Figure 242 the line that displays Q also displays an estimate of T² and I².

To this point we've established that in a simple analysis, the test of the null that T² is zero is identical for fixed-effect and for random-effects models. The same logic applies when we move on to more complex analyses, such as analyses that involve subgroups or continuous covariates. Figure 243 shows an analysis to assess the relationship between effect size and climate.

As was true for the simple analysis, hypothesis tests that involve the relationship between effect size and climate do depend on the statistical model.
Under the fixed-effect model the difference in subgroup means is based on computations where studies are weighted by 1/V. The Q-value is 110.4436, with 1 df and p < 0.0001. Under the random-effects model the difference in subgroup means is based on computations where studies are weighted by 1/(V + T²). The Q-value is 15.5445, with 1 df and p = 0.0001.

By contrast, when we turn to the question of heterogeneity within subgroups, the program presents only one set of statistics. As before, these statistics are based on fixed-effect weights and are displayed in the fixed-effect section.

Figure 243 | Heterogeneity statistics with subgroups

The null hypothesis for heterogeneity is that studies within subgroups share the same effect size. To test the null that T² is zero we compute Q and df within subgroups and sum these values across subgroups. In this case, for the Cold and Hot subgroups Q is 20.3464 plus 21.4431 for a total of 41.7895, while degrees of freedom are 5 plus 6 for a total of 11. A Q value of 41.7895 with 11 df yields a p-value of < 0.0001, and we conclude that there probably is some variation in true effects, even within subgroups.

As was true for the simple analysis, the computation of Q is identical under the FE and the RE models, since in both cases T² under the null is zero. When T² is zero, the mean effect size in each group is based on study weights of 1/V rather than 1/(V + T²), and the weights assigned to compute the weighted sum of squares are based on 1/V rather than 1/(V + T²).

To this point we've shown that in a simple analysis and also in a subgroups analysis, the test of homogeneity is identical for the fixed-effect and the random-effects models. The same applies also to a regression analysis with continuous covariates. Figure 246 and Figure 247 show a regression analysis using latitude as a covariate, for the fixed-effect model and the random-effects model, respectively.
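The within-subgroups computation described above can be reproduced in a few lines, using the Q, df, and C values reported in the text:

```python
# Q and df are computed within each subgroup and summed across subgroups.
Q_cold, df_cold = 20.3464, 5
Q_hot, df_hot = 21.4431, 6

Q_within = Q_cold + Q_hot        # 41.7895, total within-subgroups Q
df_within = df_cold + df_hot     # 11

# The same sums feed the pooled T2 (C = 319.4579, reported for Case B above).
C = 319.4579
T2 = (Q_within - df_within) / C  # 0.0964, matching equation (1.46)
```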
Under the null hypothesis for homogeneity, T² is zero for all studies with the same predicted effect size (here, for studies at the same latitude). And, as before, if the null is true then study weights are based on 1/V for both models. This means that under the null hypothesis for heterogeneity, the regression line for the RE model will be identical to the regression line under the FE model, and so the squared deviation of each study from the regression line will be the same. And the weights employed to compute the WSS will be the same under the two models. So the Q value will be identical under the two models. For that reason the statistics for the residual Q in Figure 246 and the statistics for Goodness of fit in Figure 247 are identical to each other.

Figure 244 | Heterogeneity statistics with subgroups

Figure 245 | Heterogeneity statistics with subgroups

Figure 246 | Heterogeneity statistics with continuous covariate

Figure 247 | Heterogeneity statistics with continuous covariate

To avoid confusion we should emphasize that this section deals only with the issue of heterogeneity, and not with the issue of effect sizes.

• When we are focusing on the effect size itself, the test of the null does depend on the statistical model. Under the FE model we pose the null that the effect size is zero and test this null assuming a variance of V. Under the RE model we pose the null that the effect size is zero, and test this null assuming a variance of V + T². (The same idea carries through to the more complex cases.)
• By contrast, when we are focusing on T² itself, as we do in this chapter, the test of the null is the test that T² is zero, and as such is identical for both statistical models.

Finally, while the computation of Q is identical for the FE and the RE model, the implications of the Q statistic, and what we do with this statistic, do depend on our assumptions about the statistical model.
As explained earlier, the Q-value can be used to test the null hypothesis for statistical significance. It can also be used to compute T² (the variance of true effect sizes) and to compute I² (the proportion of observed variance that represents differences in real effects rather than sampling error).

If we have adopted the fixed-effect model, a significant Q-value tells us that the data are not consistent with the model. Whether or not the Q-value is statistically significant, we do not compute T² or I², since under the FE model these are defined as being zero.

If we have adopted the random-effects model, then the fact that Q is (or is not) statistically significant has little real bearing. The model allows for variance in true effects and is valid whether or not this variance exists. Whether or not the Q-value is statistically significant, we do compute T² (which is employed to assign weights) and I² (which helps us to describe the distribution of effects).

APPENDIX 4: COMPUTING τ² IN THE PRESENCE OF SUBGROUPS

When we're working with subgroups, it's clear that we need to estimate the between-study variance τ² within subgroups rather than for the full set of studies. In our example, when we're estimating the mean effect for Hot studies and for Cold studies, the between-study variance that we need to assign weights and to discuss the unexplained variance is clearly the variance within subgroups.

However, there are two ways to estimate τ² within subgroups. One option is to compute one estimate of τ² for the Hot studies and a separate estimate for the Cold studies, and then use each estimate for the corresponding set of studies. The other option is to compute one estimate of τ² for the Hot studies and a separate estimate for the Cold studies, and then pool the two estimates and use the pooled estimate for both sets of studies.
The logic for choosing the second option is that estimates of τ² are not reliable unless they're based on a large number of studies. Unless we have good reason to believe that τ² will differ substantially from one subgroup to the next, it's often a better idea to assume that the true value of τ² is comparable for each subgroup; we'll get a better estimate of this common value by pooling the within-subgroups estimates. This is the option that we use with meta-regression, since (at least when we're using continuous covariates) the first option is not tenable. For purposes of comparing the subgroups analyses with the regression, we must select this option for subgroups as well. On the analysis screen

• Select Computational options > Mixed and random effects options
• Select the option to Assume a common among-study variance across subgroups
• Select the option to Combine subgroups using a fixed-effect model

Figure 248 | Computing τ² in the presence of subgroups

Figure 249 | Computing τ² in the presence of subgroups

Figure 250 | Computing τ² in the presence of subgroups

In this example we would pool the within-subgroup estimates of T² and apply the pooled estimate to both subgroups. The program does not display the pooled estimate on this screen. To see the pooled estimate click Next table and Calculations. The pooled estimate is 0.0964.

Figure 251 | Computing τ² in the presence of subgroups

For those who are interested, we show how to actually pool the estimates of T² and how to display the pooled estimate. The mechanism for pooling is not to take the mean of the two estimates, but rather to pool the underlying statistics and then compute T² from the combined data. Recall that

T² = (Q − df) / C   (1.67)

and so to compute a pooled value we use

T² = (ΣQ − Σdf) / ΣC,   (1.68)

where values are summed across all subgroups. To get the within-subgroup values prior to pooling

• Select "Do not assume a common among-study variance component"
• Click Fixed
• Click Calculations

In Figure 252

• Column C shows within-subgroup values of 110.9413 and 208.5166
• Column Q shows within-subgroup values of 20.3464 and 21.4431
• Column df shows within-subgroup values of 5 and 6

        C          Q         df
Cold    110.9413   20.3464    5
Hot     208.5166   21.4431    6
Sum     319.4579   41.7895   11

Figure 252 | Computing τ² in the presence of subgroups

Then

T² = (41.7895 − 11) / 319.4579 = 0.0964,   (1.69)

which is the same value we saw in Figure 251. Be sure to switch the option back to "Assume a common value" and to re-select the tab for Random.

APPENDIX 5: CREATING VARIABLES FOR INTERACTIONS

See Part 1: Data files and downloads for the location of files.

In general, it's a good idea to create all the variables you'll need on the data-entry screen before proceeding to the analysis. However, if you're in the middle of the analyses and discover that you need to create additional variables, you can easily return to the data-entry screen to do so.

In some cases it's easiest to enter data for the new variables manually. This may be the case if you have only a few studies and the computation of the new data points is simple (for example, each study is coded 0 or 1). In other cases, however, it's easier to copy the data out to Excel, create the data for the new variables, and then copy the data back into CMA. We show that process here.

Suppose we're working with the data set shown in Figure 253, which includes the variable Latitude. At some point we realize that we need a variable called Latitude-C (centered) and one called Latitude-C2 (centered, then squared) for the analyses.

Figure 253 | Creating variables for interactions

• Return to this screen.
Figure 254 | Creating variables for interactions

• Insert a column called Latitude-C and define it as Moderator > Decimal
• Insert a column called Latitude-C2 and define it as Moderator > Decimal

Figure 255 | Creating variables for interactions

• Click on the Latitude column
• Click Edit > Copy with Header

Open Excel™

• Paste the column into Column A
• Define Cell A18 as =AVERAGE(A3:A15)
• Define Cell B3 as =A3-$A$18 and copy to other rows
• Define Cell C3 as =B3^2 and copy to other rows
• Copy columns B and C (rows 3 to 15) to the clipboard

Figure 256 | Creating variables for interactions

Return to CMA

• Click on Row-1 in the Latitude-C column
• Click CTRL-V to paste the data
• Save the file with the new data

Figure 257 | Creating variables for interactions

APPENDIX 6: PLOTTING A CURVILINEAR RELATIONSHIP

The spreadsheet used in this example is [Plot of curvilinear relationship.xlsx]. See Part 1: Data files and downloads for the location of files.

In Part 14: Interaction we showed how to use Latitude and Latitude² to predict effect size. The program will not plot a curvilinear relationship, so we need to create the plot in Excel™. In this example we used Latitude-C (centered) and Latitude-C2 (centered, squared) as covariates. Figure 258 shows the results of the analysis. To create the plot we need the coefficients [A] from this figure.

Figure 258 | Plotting a curvilinear relationship

In Excel™, create columns as shown in Table 11 and Figure 259.
Table 11

Column                                        Comment
Column A is a constant                        All studies have a value of 1
Column B is the intercept (B0)                The value of −0.5501 comes from Figure 258
Column C is Latitude-C                        Values range from −20 to +20
Column D is the coefficient for Latitude-C    The value of −0.0316 comes from Figure 258
Column E is Latitude-C, squared               This is the square of column C
Column F is the coefficient for Latitude-C2   The value of −0.0008 comes from Figure 258
Column G is blank
Column H is latitude, un-centered             This is column C plus the mean latitude, 33.46
Column I is the predicted effect size         For row 3 this is =A3*B3+C3*D3+E3*F3

Figure 259 | Plotting a curvilinear relationship

In rows 3-43 we enter data for a sequence of possible studies. There are 41 rows, where the value of Latitude-C (centered) ranges from −20 to +20. Note that these are not the actual studies in our analysis, but rather the approximate range of values for Latitude-C in our analysis.

At this point column H holds the latitudes while column I holds the predicted values. We can use these two columns to create a plot. The instructions here are for Microsoft Excel™ 2010. The specific commands may vary slightly for other versions.
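The same worksheet logic can be sketched in Python, using the coefficient values reported in Table 11; the loop plays the role of rows 3-43:

```python
# Computing column I per the row-3 formula =A3*B3+C3*D3+E3*F3,
# with the coefficients reported in Table 11.
b0, b1, b2 = -0.5501, -0.0316, -0.0008   # intercept, Latitude-C, Latitude-C2
mean_latitude = 33.46                    # used to un-center column H

rows = []
for lat_c in range(-20, 21):                         # 41 rows, like rows 3-43
    predicted = b0 + b1 * lat_c + b2 * lat_c ** 2    # column I
    rows.append((lat_c + mean_latitude, predicted))  # (column H, column I)

print(len(rows))   # 41
```

Plotting the first elements of each pair against the second reproduces the curve created from columns H and I in the spreadsheet.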
• Highlight the two columns from row 3 to the bottom (Figure 260)
• Select Insert > Scatter > Scatter with smooth lines
• The program creates this plot (Figure 261)

Figure 260 | Plotting a curvilinear relationship

Figure 261 | Plotting a curvilinear relationship

You may want to customize the graph as follows

Layout > Legend: None
Layout > Axes > Primary vertical axis > More options > Vertical axis crosses > Axis value: −1.8
Layout > Axes > Primary vertical axis > More options > Number > Number > Decimal places: 2
Layout > Chart title > Above chart: Effect size as a function of Latitude and Latitude2
Layout > Axis title > Primary horizontal axis title > Title below Axis: Latitude
Layout > Axis title > Primary vertical axis title > Rotated Title: Log Risk Ratio

At this point the plot should look like Figure 262.

Figure 262 | Plotting a curvilinear relationship (log risk ratio as a function of latitude)

Notes

The key to plotting a non-linear relationship is that we create a row for every value of Latitude-C. Then we compute the corresponding value of the square, and the predicted value. The same idea can be extended to higher-order relationships as well.

We could have plotted effect size as a function of Latitude-C, in which case the X-axis would run from −20 to +20. Instead, we un-centered the predictor (adding the mean latitude to each centered value), and so the axis runs from 0 to 60. Critically, this edit takes place only at the last step. The prediction equations are based on the centered scores.

APPENDIX 7: PLOTTING INTERACTIONS

In this appendix we show how to plot three kinds of interactions

• The interaction of two categorical covariates
• The interaction of one categorical and one continuous covariate
• The interaction of two continuous covariates

All three follow basically the same format in Excel™.
The only important difference is that for the first two we select the Line graph, and for the third we select the Scatter graph.

Plotting the interaction of two categorical covariates

In an earlier chapter we ran an analysis to assess the interaction of Climate by Time. Here, we show how to plot this interaction using Excel™.

See Part 1: Data files and downloads for the location of files. The spreadsheet used in this example is [Plot of hot x time.xlsx].

The original data set includes the covariates Latitude and Year, both of which are continuous. For purposes of this discussion we need two categorical covariates, and so we create them by dichotomizing Latitude and Year:

• Hot is coded 1 if the latitude is 34 or less, and 0 if the latitude exceeds 34.
• Recent is coded 1 if the Year is 1945 or later, and 0 if the Year is earlier than 1945.
• Hot x Recent is created by multiplying Hot by Recent.

The results of the analysis are shown in Figure 264. To create the plot we'll need the covariate names and coefficients from this figure. Copy these to Excel™ columns B and C as shown in Figure 265.

Figure 263 | Plotting interaction of two categorical covariates

Figure 264 | Plotting interaction of two categorical covariates

Figure 265 | Plotting interaction of two categorical covariates

To create the plot we need two points (Early and Recent) for the Cold studies:

• Column E: Cold and Early. The X values are Hot (0), Recent (0), Interaction (0).
• Column F: Cold and Recent. The X values are Hot (0), Recent (1), Interaction (0).

To create the plot we need two points (Early and Recent) for the Hot studies:

• Column H: Hot and Early. The X values are Hot (1), Recent (0), Interaction (0).
• Column I: Hot and Recent. The X values are Hot (1), Recent (1), Interaction (1).

The predicted value for each column (E, F, H, I) is given by multiplying each X value by the corresponding coefficient in column C, and then summing across the four rows.
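The "multiply each X value by its coefficient and sum" rule is an ordinary dot product, and can be sketched in Python. Note that the coefficients below are not copied from Figure 264 (which is not reproduced here); they are back-derived from the four predicted endpoint values quoted in this section (−1.1154, −1.4416, −0.2164, −0.3035), so they inherit that rounding.

```python
# Predicted log risk ratio for the Climate x Time model:
#   y = b_intercept + b_hot*Hot + b_recent*Recent + b_inter*(Hot*Recent)
# Coefficients back-derived from the endpoint values quoted in the text,
# not copied from Figure 264.
b_intercept = -1.1154
b_hot       =  0.8990
b_recent    = -0.3262
b_inter     =  0.2391

def predict(hot, recent):
    """Mirror of the spreadsheet formula: sum of X values times column C."""
    xs = (1, hot, recent, hot * recent)           # the four X values
    bs = (b_intercept, b_hot, b_recent, b_inter)  # the four coefficients
    return sum(x * b for x, b in zip(xs, bs))

cold_early  = predict(0, 0)   # column E
cold_recent = predict(0, 1)   # column F
hot_early   = predict(1, 0)   # column H
hot_recent  = predict(1, 1)   # column I
```

With this coding, the intercept is the Cold/Early prediction, and the interaction coefficient is applied only when both Hot and Recent equal 1.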
For example, the formula for E10 is
=E5*$C$5+E6*$C$6+E7*$C$7+E8*$C$8

To create the plot we identify the cells E10 and F10 as endpoints for the Cold studies, and H10 and I10 as endpoints for the Hot studies.

Insert > Graph > Line > With markers

Design > Select data > Add Series
  Series name: H2 (Hot)
  Series X-values: H3:I3 (Early, Recent)
  Series Y-values: H10:I10 (−0.2164, −0.3035)

Design > Select data > Add Series
  Series name: E2 (Cold)
  Series X-values: E3:F3 (Early, Recent)
  Series Y-values: E10:F10 (−1.1154, −1.4416)

Layout > Axis > Primary vertical axis
  Axis options > Minimum > Fixed: −2.0
  Axis options > Maximum > Fixed: 0.5
  Horizontal axis crosses: −2.0
  Number > Number > Decimal places: 2

Layout > Chart title > Above chart: Log risk ratio as a function of climate, year, interaction

Format
  Series > Hot: Format selection > Line style > Dash type > Solid
  Series > Cold: Format selection > Line style > Dash type > Dashed

Figure 266 | Plotting interaction of two categorical covariates

Plotting the interaction of a categorical covariate by a continuous covariate

The spreadsheet used in this example is [Plot of hot x year-c.xlsx].

In Part 14: Interaction we ran an analysis to assess the interaction of Year-C by Climate. Since Climate is a categorical variable, the actual analysis covariates are Year-C and Hot, and the interaction is Year-C by Hot. Following our convention, Hot is coded 0 for Cold studies and 1 for Hot studies. The interaction is the product of the two.

The results of the analysis are shown in Figure 267. To create the plot we'll need the covariate names and coefficients from Figure 267. In Excel™, copy these to columns B and C as shown in Figure 268.
Figure 267 | Plotting interaction of categorical by continuous covariates

Figure 268 | Plotting interaction of categorical by continuous covariates

To create the plot we need two points (1933 and 1968) for the Cold studies:

• Column E: Cold and 1933. The X values are Hot (0), Year-C (−15), Interaction (0).
• Column F: Cold and 1968. The X values are Hot (0), Year-C (20), Interaction (0).

To create the plot we need two points (1933 and 1968) for the Hot studies:

• Column H: Hot and 1933. The X values are Hot (1), Year-C (−15), Interaction (−15).
• Column I: Hot and 1968. The X values are Hot (1), Year-C (20), Interaction (20).

The predicted value for each column (E, F, H, I) is given by multiplying each X value by the corresponding coefficient in column C, and then summing across the four rows. For example, the formula for E10 is
=E5*$C$5+E6*$C$6+E7*$C$7+E8*$C$8

Note

The columns are labeled 1933 and 1968, since these are the values we want to display in the plot. Critically, these are not the values we use to compute the effect size. Rather, the X values in the data columns are the corresponding centered values, −15 and 20.

To create the plot we identify the cells E10 and F10 as endpoints for the Cold studies, and H10 and I10 as endpoints for the Hot studies.
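The centering convention is worth making concrete: the plot labels are calendar years, but the prediction uses centered values. In the sketch below, the mean year of 1948 is inferred from the pairing in the text (1933 maps to −15 and 1968 maps to 20); it is not a value stated explicitly.

```python
# Year-C is the study year minus the mean year. The pairing in the text
# (1933 -> -15, 1968 -> 20) implies a mean year of 1948 (an inference,
# not a value stated in the text).
MEAN_YEAR = 1948

def year_c(year):
    """Centered year, the value actually fed into the prediction."""
    return year - MEAN_YEAR

# The interaction Hot x Year-C equals Year-C itself when Hot = 1,
# and 0 when Hot = 0 -- hence the X values listed above.
for year in (1933, 1968):
    yc = year_c(year)
    print(year, yc, 1 * yc, 0 * yc)  # label, Year-C, Hot interaction, Cold interaction
```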
Insert > Graph > Line > With markers

Design > Select data > Add Series
  Series name: H2 (Hot)
  Series X-values: H3:I3 (1933, 1968)
  Series Y-values: H10:I10 (−0.0275, −0.3929)

Design > Select data > Add Series
  Series name: E2 (Cold)
  Series X-values: E3:F3 (1933, 1968)
  Series Y-values: E10:F10 (−1.0184, −1.9286)

Layout > Axis > Primary vertical axis
  Axis options > Minimum > Fixed: −2.5
  Axis options > Maximum > Fixed: 0.5
  Horizontal axis crosses: −2.5
  Number > Number > Decimal places: 2

Layout > Axis > Primary horizontal axis
  Axis options > Minimum > Fixed: 1930
  Axis options > Maximum > Fixed: 1970
  Vertical axis crosses: 1930
  Number > Decimal places: 0

Layout > Chart title > Above chart: Log risk ratio as a function of climate, year, interaction

Format
  Series > Hot: Format selection > Line style > Dash type > Solid
  Series > Cold: Format selection > Line style > Dash type > Dashed

Figure 269 | Plotting interaction of categorical by continuous covariates

Plotting the interaction of two continuous covariates

In this example we assess the impact of Year-C, Latitude-C, and Year-C x Latitude-C.

See Part 1: Data files and downloads for the location of files. The spreadsheet used in this example is [Plot of latitude-c x year-c.xlsx].

• Year-C is the study year, centered
• Latitude-C is the latitude, centered
• Year-C x Latitude-C is the interaction

The results of the analysis are shown in Figure 270. To create the plot we'll need the covariate names and coefficients from Figure 270. In Excel™, copy these to columns B and C as shown in Figure 271.

Figure 270 | Plotting interaction of continuous covariates

Figure 271 | Plotting interaction of continuous covariates

We will plot effect size for three latitudes (13, 33, 55).
For each of these latitudes, we need the effect size at two years (1933, 1968).

To create the plot we need two points (1933 and 1968) for Latitude 13:

• Column E: Latitude=13, 1933. The X values are Latitude-C (−20), Year-C (−15), Interaction (300).
• Column F: Latitude=13, 1968. The X values are Latitude-C (−20), Year-C (20), Interaction (−400).

To create the plot we need two points (1933 and 1968) for Latitude 33:

• Column H: Latitude=33, 1933. The X values are Latitude-C (0), Year-C (−15), Interaction (0).
• Column I: Latitude=33, 1968. The X values are Latitude-C (0), Year-C (20), Interaction (0).

To create the plot we need two points (1933 and 1968) for Latitude 55:

• Column K: Latitude=55, 1933. The X values are Latitude-C (21), Year-C (−15), Interaction (−315).
• Column L: Latitude=55, 1968. The X values are Latitude-C (21), Year-C (20), Interaction (420).

The predicted value for each column (E, F, H, I, K, L) is given by multiplying each X value by the corresponding coefficient in column C, and then summing across the four rows. For example, the formula for E10 is
=E5*$C$5+E6*$C$6+E7*$C$7+E8*$C$8

Note

The columns are labeled with the years 1933 and 1968, since these are the values we want to display in the plot. Critically, these are not the values we use to compute the effect size. Rather, the X values in the data columns are the corresponding centered values, −15 and 20. Similarly, the columns are labeled with the latitudes 13, 33, and 55, since these are the values we want to display in the plot. Critically, these are not the values we use to compute the effect size. Rather, the X values in the data columns are the corresponding centered values, −20, 0, and 21.

To create the plot we identify the cells E10 and F10 as endpoints for the Latitude=13 studies, H10 and I10 as endpoints for the Latitude=33 studies, and K10 and L10 as endpoints for the Latitude=55 studies.
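The interaction X value for each point is simply the product of the two centered covariates, which is where values such as 300 and −400 come from. A sketch, using the centered values listed above:

```python
# X values for the Latitude-C x Year-C plot. Each plotted point supplies
# Latitude-C, Year-C, and their product as the interaction term.
lat_centered  = {13: -20, 33: 0, 55: 21}  # display latitude -> centered value
year_centered = {1933: -15, 1968: 20}     # display year -> centered value

x_values = {}
for lat, lat_c in lat_centered.items():
    for year, yr_c in year_centered.items():
        x_values[(lat, year)] = (lat_c, yr_c, lat_c * yr_c)
```

Note that the interaction is always 0 for Latitude 33, since its centered value is 0; that is why the middle line of the plot shows only the main effect of year.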
Insert > Scatter > Smooth lines

Design > Select data > Add Series
  Series name: E2 (Lat=13)
  Series X-values: E3:F3 (1933, 1968)
  Series Y-values: E10:F10 (0.1318, −0.2796)

Design > Select data > Add Series
  Series name: H2 (Lat=33)
  Series X-values: H3:I3 (1933, 1968)
  Series Y-values: H10:I10 (−0.6171, −0.7835)

Design > Select data > Add Series
  Series name: K2 (Lat=55)
  Series X-values: K3:L3 (1933, 1968)
  Series Y-values: K10:L10 (−1.4035, −1.3126)

Layout > Axis > Primary vertical axis
  Axis options > Minimum > Fixed: −2.5
  Axis options > Maximum > Fixed: 0.5
  Horizontal axis crosses: −2.5
  Number > Number > Decimal places: 2

Layout > Axis > Primary horizontal axis
  Axis options > Minimum > Fixed: 1930
  Axis options > Maximum > Fixed: 1970
  Vertical axis crosses: 1930
  Number > Decimal places: 0
  Number > Use 1000 separator: No

Layout > Chart title > Above chart: Log risk ratio as a function of latitude, year, interaction

Format
  Series > Lat=13: Format selection > Line style > Dash type > Solid
  Series > Lat=33: Format selection > Line style > Dash type > Dashed
  Series > Lat=55: Format selection > Line style > Dash type > Dashed

Figure 272 | Plotting interaction of continuous covariates

APPENDIX 8: INTERPRETING REGRESSION COEFFICIENTS

Here, we are working with studies where the treatment is associated with lower risk. Given the coding, most risk ratios are less than 1.0 and (equivalently) most log risk ratios are negative. Therefore, a negative coefficient implies that a higher score is associated with a lower risk (the log risk ratio becomes more negative, and the risk ratio moves further from 1.0). By contrast, if we were working with studies where the risk ratio tended to be greater than 1 (for example the "risk" of living longer), then the log risk ratio would tend to be positive. Then, 0 is no effect, +1 is a large effect, and +2 is a very large effect.
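To move between the log risk ratio scale used by the coefficients and the risk ratio scale, exponentiate. A small sketch (the per-year coefficient of 0.0235 is the value quoted in this appendix; `math.exp` is the standard-library exponential):

```python
import math

# A log risk ratio of 0 is no effect; negative values favor the treatment.
log_rr = -1.0
rr = math.exp(log_rr)  # risk ratio corresponding to a log risk ratio of -1

# A regression coefficient is additive per unit of the covariate on the
# log scale, hence multiplicative on the risk-ratio scale. For example,
# the per-year coefficient of 0.0235 quoted in this appendix multiplies
# the risk ratio by exp(0.0235) for each additional year.
per_year = math.exp(0.0235)
```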
Therefore, a positive coefficient means that as the covariate gets larger the vaccine is more effective.

The coefficient for Year is 0.0235, which means that for every increase of one year the log risk ratio will increase by 0.0235 (the vaccine became less effective over time). The corresponding p-value is 0.1390.

APPENDIX 9: META-REGRESSION IN STATA

In this section we show the correspondence between results produced by CMA and those produced by the Stata macro "metareg".

Figure 273 | CMA | Intercept + Year + Latitude + Allocation | Z | Method of moments
Figure 274 | Metareg | Intercept + Year + Latitude + Allocation | Z | Method of moments
Figure 275 | CMA | Allocation | Z | Method of moments
Figure 276 | Metareg | Allocation | Z | Method of moments
Figure 277 | CMA | Allocation, Year | Z | Method of moments
Figure 278 | Metareg | Allocation, Year | Z | Method of moments
Figure 279 | CMA | Intercept, Year-C, Year-C2 | Z | Method of moments
Figure 280 | Metareg | Intercept, Year-C, Year-C2 | Z | Method of moments

REFERENCES

Berkey, C.S., Hoaglin, D.C., Mosteller, F., and Colditz, G.A. (1995). A random-effects regression model for meta-analysis. Statistics in Medicine, 14, 395-411.

Borenstein, M., Hedges, L.V., Higgins, J.P.T., and Rothstein, H. (2009). Introduction to Meta-Analysis. Chichester: Wiley.

Cohen, J., Cohen, P., West, S.G., and Aiken, L.S. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd edition). Mahwah: Lawrence Erlbaum Associates.

Colditz, G.A., Brewer, T.F., Berkey, C.S., Wilson, M.E., Burdick, E., Fineberg, H.V., and Mosteller, F. (1994). Efficacy of BCG vaccine in the prevention of tuberculosis. Journal of the American Medical Association, 271, 698-702.

Egger, M., Davey Smith, G., and Altman, D.G. (2001). Systematic Reviews in Health Care: Meta-Analysis in Context (2nd edition). London: BMJ Books.

Hartung, J., Knapp, G., and Sinha, B.K. (2008). Statistical Meta-Analysis with Applications. Hoboken: Wiley.
Hedges, L.V., and Olkin, I. (1985). Statistical Methods for Meta-Analysis. Boston: Academic Press.

Hedges, L.V., and Pigott, T.D. (2001). The power of statistical tests in meta-analysis. Psychological Methods, 6, 203-217.

Hedges, L.V., and Pigott, T.D. (2004). The power of statistical tests for moderators in meta-analysis. Psychological Methods, 9, 426-445.

Higgins, J.P.T., and Thompson, S.G. (2004). Controlling the risk of spurious findings from meta-regression. Statistics in Medicine, 23, 1663-1682.

Rothman, K.J. (1990). No adjustments are needed for multiple comparisons. Epidemiology, 1, 43-46.

Sutton, A.J., Abrams, K.R., Jones, D.R., and Song, F. (2000). Methods for Meta-Analysis in Medical Research. Chichester: John Wiley and Sons.

