# Regression in Meta-Analysis

Michael Borenstein, Larry V. Hedges, Julian P.T. Higgins, Hannah Rothstein

*Draft – Please do not quote. This draft was released January 6, 2015.*

An updated copy of this manual will be posted at
http://www.meta-analysis.com/pages/cma_manual.php
## Contents

- Part 2: Overview of Meta-regression
- Part 3: The BCG example
- Part 4: Meta-regression is observational
- Part 5: Fixed-effect vs. Random-effects
  - Putting regression in context
  - Fixed-effect model
    - Basic analysis (Case A)
      - Test of effect size
      - Test of the statistical model
      - The regression approach
        - Test of effect size
        - Analysis of variance
      - Summary
    - Subgroups analysis (Case B)
      - Is the common effect size zero for each subgroup?
      - Analysis of variance
      - The regression approach
        - Analysis of variance
      - Summary
    - Continuous covariate (Case C)
      - The regression approach
        - Analysis of variance
      - Summary
    - In context
  - Random-effects model
    - Basic analysis (Case A)
      - Test of effect size
      - Heterogeneity
      - The regression approach
        - Test of effect size
        - Test of the model
        - Goodness of fit
        - Comparison of Model 1 with the null model
      - Summary
    - Subgroups analysis (Case B)
      - Note on computing T²
      - Is the mean effect size zero for each subgroup?
      - Test of the model
      - Heterogeneity
      - The regression approach
        - Test of the model
        - Goodness of fit
        - Comparison of Model 1 with the null model
      - Summary
    - Continuous covariate (Case C)
      - The regression approach
        - Prediction equation
        - Test of the model
        - Goodness of fit
        - Comparison of Model 1 with the null model
      - Summary
    - In context
- Part 6: Meta-regression in CMA
  - What’s new in this version of meta-regression?
  - The covariates and the predictive model
  - Quick Start
  - Step 1: Enter the data
    - Insert column for study names
    - Insert columns for effect size data
    - Insert columns for moderators (covariates)
    - Customize the screen
    - Enter the data
  - Step 2: Run the basic meta-analysis
    - The main analysis screen
    - The initial meta-analysis
    - Display moderator variables
    - Display statistics
  - Step 3: Run the meta-regression
    - The Interactive Wizard
    - Add covariates to the model
    - Set computational options
    - Run the regression
  - Step 4: Navigate the results
    - Main results screen (Fixed effect)
    - Main results screen (Random effects)
    - Difference between the fixed-effect and random-effects displays
    - Plot
    - Other screens
  - Step 5: Save the analysis
  - Step 6: Export the results
- Part 7: Understanding the results
  - Main results
  - Main results, fixed-effect analysis
    - Test of the model
    - Analysis of variance
    - Summary
  - Main results, random-effects analysis
    - Test of the model [D]
    - Goodness of fit [E]
    - Comparison of Model 1 with the null model
    - Summary
  - Diagnostics
  - Covariance
  - Correlations
  - Increments
- Part 8: The R² index
  - The schematic for R²
  - A seeming anomaly
  - Assessing change in the model
  - Understanding I²
- Part 9: Working with the plot
  - Confidence interval and prediction interval
- Part 10: Computational options
  - Knapp-Hartung vs. Z
  - One-point or simultaneous confidence intervals for graph
  - Options for estimating τ² (MM, ML, REML)
  - One-sided vs. two-sided tests
- Part 11: Categorical covariates
  - Dummy variables
- Part 12: When does it make sense to omit the intercept?
  - The example
  - Interpreting the results
- Part 13: Working with “Sets” of covariates
  - Defining a “Set”
  - How to create a Set
  - How to remove a Set
- Part 14: Interactions and curvilinear relationships
  - Interaction of two categorical covariates
  - Interaction of a categorical covariate with a continuous covariate
  - Interaction of two continuous covariates
  - Curvilinear relationships
- Part 15: Missing data
- Part 16: Filter studies
- Part 17: Defining several models
- Part 18: Complex data structures
  - Independent subgroups within studies
    - Using subgroup as the unit of analysis
    - Using study as the unit of analysis
  - Multiple outcomes or time-points
  - Multiple comparisons
- Part 19: Some caveats
  - Statistical power for meta-regression
  - Multiple comparisons
- Part 20: Technical Appendix
  - Appendix 1: The dataset
  - Appendix 2: Understanding Q
  - Appendix 3: Tests of heterogeneity
  - Appendix 4: Computing τ² in the presence of subgroups
  - Appendix 5: Creating variables for interactions
  - Appendix 6: Plotting a curvilinear relationship
  - Appendix 7: Plotting interactions
    - Plotting the interaction of two categorical covariates
    - Plotting the interaction of a categorical covariate by a continuous covariate
    - Plotting the interaction of two continuous covariates
  - Appendix 8: Interpreting regression coefficients
  - Appendix 9: Meta-regression in Stata
- References
## List of Figures

- Figure 1 | Basic analysis | Random effects | Risk ratio
- Figure 2 | Data-entry screen
- Figure 3 | Basic analysis | Random effects | Log risk ratio
- Figure 4 | Regression of log risk ratio on latitude | Fixed-effect
- Figure 5 | Regression of log risk ratio on latitude | Random-effects
- Figure 6 | Basic analysis | Fixed-effect | Log risk ratio
- Figure 7 | Basic analysis | Fixed-effect | Log risk ratio
- Figure 8 | Regression setup | Intercept only
- Figure 9 | Regression | Main results | Fixed-effect | Intercept only
- Figure 10 | Subgroups Cold vs. Hot | Fixed-effect
- Figure 11 | Subgroups Cold vs. Hot | Fixed-effect
- Figure 12 | Regression Cold vs. Hot | Setup
- Figure 13 | Regression Cold vs. Hot | Fixed-effect
- Figure 14 | Regression | Latitude | Setup
- Figure 15 | Regression | Latitude | Fixed-effect
- Figure 16 | Basic analysis | Log risk ratio | Random-effects
- Figure 17 | Basic analysis | Log risk ratio | Random-effects
- Figure 18 | Regression | Intercept | Setup
- Figure 19 | Regression | Intercept | Main results | Random-effects
- Figure 20 | Dispersion of effects about grand mean
- Figure 21 | Subgroups Cold vs. Hot | Random-effects
- Figure 22 | Subgroups Cold vs. Hot | Random-effects
- Figure 23 | Option for computing T² in the presence of subgroups
- Figure 24 | Option for computing T² in the presence of subgroups
- Figure 25 | Regression | Climate | Setup
- Figure 26 | Regression | Climate | Main results | Random-effects
- Figure 27 | Dispersion of effects about the subgroup means
- Figure 28 | Dispersion about grand mean vs. dispersion about subgroup means
- Figure 29 | Regression | Latitude | Setup
- Figure 30 | Regression | Latitude | Main results | Random-effects
- Figure 31 | Dispersion of effects about regression line for latitude
- Figure 32 | Dispersion about grand mean vs. dispersion about regression line
- Figure 33 | Data-entry | Step 01
- Figure 34 | Data-entry | Step 02
- Figure 35 | Data-entry | Step 03
- Figure 36 | Data-entry | Step 04
- Figure 37 | Data-entry | Step 05
- Figure 38 | Data-entry | Step 06
- Figure 39 | Data-entry | Step 07
- Figure 40 | Data-entry | Step 08
- Figure 41 | Data-entry | Step 09
- Figure 42 | Data-entry | Step 10
- Figure 43 | Data-entry | Step 11
- Figure 44 | Data-entry | Step 12
- Figure 45 | Data-entry | Step 13
- Figure 46 | Data-entry | Step 14
- Figure 47 | Data-entry | Step 15
- Figure 48 | Data-entry | Step 16
- Figure 49 | Data-entry | Step 17
- Figure 50 | Data-entry | Step 18
- Figure 51 | Data-entry | Step 19
- Figure 52 | Basic analysis | Fixed-effect | Log risk ratio
- Figure 53 | Basic analysis | Random-effects | Log risk ratio
- Figure 54 | Basic analysis | Display moderators
- Figure 55 | Basic analysis | Display moderators
- Figure 56 | Basic analysis | Display moderators
- Figure 57 | Basic analysis | Display statistics for heterogeneity
- Figure 58 | Run regression | Step 01
- Figure 59 | Run regression | Step 02
- Figure 60 | Run regression | Step 03
- Figure 61 | Run regression | Step 04
- Figure 62 | Run regression | Step 05
- Figure 63 | Run regression | Step 06
- Figure 64 | Main results | Fixed-effect
- Figure 65 | Main results | Random-effects
- Figure 66 | Plot
- Figure 67 | Other screens
- Figure 68 | Save analysis
- Figure 69 | Export results
- Figure 70 | Export results
- Figure 71 | Setup
- Figure 72 | Main results | Fixed-effect
- Figure 73 | Plot | Fixed-effect
- Figure 74 | Plot | Year | Fixed-effect
- Figure 75 | Plot | Latitude | Fixed-effect
- Figure 76 | Run regression | Setup
- Figure 77 | Main results | Random-effects
- Figure 78 | Main results | Random-effects
- Figure 79 | Dispersion of effects about regression line for latitude
- Figure 80 | Plot | Allocation method | Random-effects
- Figure 81 | Plot | Year | Random-effects
- Figure 82 | Plot | Latitude | Random-effects
- Figure 83 | Dispersion of effects about two regression lines
- Figure 84 | Setup
- Figure 85 | Diagnostics
- Figure 86 | Covariance matrix
- Figure 87 | Correlation matrix
- Figure 88 | Main results | Random-effects
Figure 89 | Setup | Intercept only ............................................................................................................ 135
Figure 90 | Main results | Intercept only ................................................................................................. 136
Figure 91 | Setup | Intercept + Allocation ................................................................................................ 137
Figure 92 | Main results | Intercept + Allocation ..................................................................................... 137
Figure 93 | Setup | Intercept + Allocation + Year ..................................................................................... 139
Figure 94 | Main results | Intercept + Allocation + Year .......................................................................... 139
Figure 95 | Setup | Intercept + Allocation + Year + Latitude .................................................................... 141
Figure 96 | Main results | Intercept + Allocation + Year + Latitude ......................................................... 141
Figure 97 | Main results | Intercept + Allocation + Year + Latitude ......................................................... 143
Figure 98 | Setup ...................................................................................................................................... 145
Figure 99 | Main results | Latitude | Random-effects ............................................................................. 146
Figure 100 | Dispersion of effects about grand mean vs. dispersion of effects about regression line .... 147
Figure 101 | Display R2 .............................................................................................................................. 149
Figure 102 | Schematic for R2 ................................................................................................................... 149
Figure 103 | Setup .................................................................................................................................... 152
Figure 104 | Display increments ............................................................................................................... 153
Figure 105 | Increments ........................................................................................................................... 154
Figure 106 | Main results | Random-effects ............................................................................................ 157
Figure 107 | Setup .................................................................................................................................... 159
Figure 108 | Main results | Random-effects ............................................................................................ 160
Figure 109 | Plot of log risk ratio on Latitude | Random-effects ............................................................. 161
Figure 110 | Plot of log risk ratio on Latitude | Select variable for X-axis ................................................ 162
Figure 111 | Plot of log risk ratio on Latitude | Blank canvas .................................................................. 163
Figure 112 | Plot of log risk ratio on Latitude | Studies ........................................................................... 164
Figure 113 | Plot of log risk ratio on Latitude | Regression line ............................................................... 165
Figure 114 | Plot of log risk ratio on Latitude | Confidence interval ........................................................ 168
Figure 115 | Plot of log risk ratio on Latitude | Prediction interval ......................................................... 169
Figure 116 | Plot of log risk ratio on Latitude | Identify studies .............................................................. 170
Figure 117 | Regression | Setup ............................................................................................................... 173
Figure 118 | Regression | Main results | Random-effects ....................................................................... 174
Figure 119 | Regression | Plot | Categorical covariate ............................................................................ 175
Figure 120 | Regression | Plot | Setting the scale anchors ...................................................................... 176
Figure 121 | Regression | Set statistical options ...................................................................................... 177
Figure 122 | Regression | Setup ............................................................................................................... 180
Figure 123 | Set statistical options | Z-Distribution vs. Knapp-Hartung .................................................. 181
Figure 124 | Main results | Z-Distribution ................................................................................................ 182
Figure 125 | Set statistical options | Z-Distribution vs. Knapp-Hartung .................................................. 183
Figure 126 | Main results | Knapp-Hartung ............................................................................................. 184
Figure 127 | Set statistical options | One-point confidence intervals...................................................... 188
Figure 128 | Set statistical options | Simultaneous confidence intervals ................................................ 188
Figure 129 | Set statistical options | Estimating T2 .................................................................................. 189
Figure 130 | Creating dummy variables ................................................................................................... 193
Figure 131 | Creating dummy variables ................................................................................................... 194
Figure 132 | Categorical variables ............................................................................................................ 195
Figure 133 | Creating dummy variables ................................................................................................... 195
Figure 134 | Dummy variables | Allocation with “Randomized” as the reference group........................ 197
Figure 135 | Dummy variables | Allocation with “Systematic” as the reference group .......................... 198
Figure 136 | Dummy variables | Allocation with “Alternate” as the reference group ............................ 199
Figure 137 | Subgroups | Allocation type................................................................................................. 203
Figure 138 | Data-entry | Dummy variables for Hot and Cold ................................................................. 206
Figure 139 | Basic analysis | Computing T2 in the presence of subgroups .............................................. 207
Figure 140 | Basic analysis | Subgroups Cold vs. Hot ............................................................................... 207
Figure 141 | Basic analysis | Subgroups Cold vs. Hot ............................................................................... 208
Figure 142 | Regression | Setup | No intercept ....................................................................................... 208
Figure 143 | Regression | Main results | No intercept ............................................................................ 209
Figure 144 | Regression | Main results | Assessing the impact of a set .................................................. 212
Figure 145 | Main results | Assessing the impact of a set........................................................................ 213
Figure 146 | Setup | Defining a set of covariates ..................................................................................... 214
Figure 147 | Setup | Naming a set of covariates ...................................................................................... 215
Figure 148 | Setup | Naming a set of covariates ...................................................................................... 215
Figure 149 | Regression | Main results | Working with a set of covariates............................................. 216
Figure 150 | Main results | Removing a set of covariates........................................................................ 216
Figure 151 | Setup | Interaction of two categorical covariates ............................................................... 220
Figure 152 | Main results | Interaction of two categorical covariates ..................................................... 221
Figure 153 | Plot | Interaction of two categorical covariates .................................................................. 223
Figure 154 | Plot | Interaction of two categorical covariates .................................................................. 224
Figure 155 | Plot | Interaction of two categorical covariates .................................................................. 225
Figure 156 | Setup | Interaction of categorical and continuous covariates............................................. 227
Figure 157 | Main results | Interaction of categorical and continuous covariates .................................. 228
Figure 158 | Plot | Interaction of categorical and continuous covariates................................................ 230
Figure 159 | Plot | Interaction of categorical and continuous covariates................................................ 231
Figure 160 | Plot | Interaction of categorical and continuous covariates................................................ 232
Figure 161 | Setup | Interaction of two continuous covariates ............................................................... 234
Figure 162 | Main results | Interaction of two continuous covariates .................................................... 235
Figure 163 | Plot | Interaction of two continuous covariates .................................................................. 236
Figure 164 | Plot | Interaction of two continuous covariates .................................................................. 237
Figure 165 | Plot | Interaction of two continuous covariates .................................................................. 238
Figure 166 | Setup | Curvilinear relationship ........................................................................................... 240
Figure 167 | Main results | Curvilinear relationship ................................................................................ 241
Figure 168 | Plot | Curvilinear relationship .............................................................................................. 242
Figure 169 | Setup .................................................................................................................................... 244
Figure 170 | Data-entry | Missing data for latitude ................................................................................. 245
Figure 171 | Basic analysis | Missing data for latitude ............................................................................. 246
Figure 172 | Regression | Setup | Latitude in list and checked ............................................................... 246
Figure 173 | Regression | Main results | Missing data ............................................................................ 247
Figure 174 | Table of missing data............................................................................................................ 248
Figure 175 | Setup | Latitude in list, unchecked ...................................................................................... 249
Figure 176 | Main results | Latitude in list, unchecked ............................................................................ 250
Figure 177 | Setup | Latitude must be removed from list........................................................................ 251
Figure 178 | Setup | Latitude removed from list...................................................................................... 252
Figure 179 | Main results | Latitude removed from list ........................................................................... 252
Figure 180 | Data entry............................................................................................................................. 254
Figure 181 | Basic analysis ........................................................................................................................ 254
Figure 182 | Meta-regression ................................................................................................................... 255
Figure 183 | Select by study name ........................................................................................................... 255
Figure 184 | Select by study name ........................................................................................................... 256
Figure 185 | Create a moderator for filtering ........................................................................................... 257
Figure 186 | Filter by moderator .............................................................................................................. 258
Figure 187 | Regression using a filter ....................................................................................................... 259
Figure 188 | Select by moderator ............................................................................................................. 260
Figure 189 | Select by moderator ............................................................................................................. 261
Figure 190 | Filter by moderator .............................................................................................................. 261
Figure 191 | Defining several models | Setup .......................................................................................... 262
Figure 192 | Defining several models | Setup .......................................................................................... 262
Figure 193 | Defining several models | Main-analysis | Intercept........................................................... 263
Figure 194 | Defining several models | Main-analysis | Intercept + year ................................................ 264
Figure 195 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 265
Figure 196 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 266
Figure 197 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 267
Figure 198 | Defining several models | Setup .......................................................................................... 268
Figure 199 | Defining several models | Main-analysis | Intercept + year + latitude ............................... 269
Figure 200 | Defining several models | Setup | Year or Latitude ............................................................ 269
Figure 201 | Multiple predictive models | Plot based on Year ................................................................ 270
Figure 202 | Multiple predictive models | Plot based on Year + Latitude ............................................... 271
Figure 203 | Data-entry | Complex data-structures ................................................................................. 273
Figure 204 | Data-entry | Complex data-structures ................................................................................. 274
Figure 205 | Basic analysis | Subgroup within-study as unit of analysis .................................................. 274
Figure 206 | Basic analysis | Subgroup within-study as unit of analysis .................................................. 275
Figure 207 | Regression | Subgroup within-study as unit of analysis ...................................................... 276
Figure 208 | Regression | Subgroup within-study as unit of analysis ...................................................... 277
Figure 209 | Basic analysis | Study as unit of analysis.............................................................................. 278
Figure 210 | Basic analysis | Study as unit of analysis.............................................................................. 278
Figure 211 | Regression | Study as unit of analysis .................................................................................. 279
Figure 212 | Regression | Study as unit of analysis .................................................................................. 280
Figure 213 | Data-entry | Multiple outcomes .......................................................................................... 282
Figure 214 | Data-entry | Multiple outcomes .......................................................................................... 282
Figure 215 | Basic analysis | Multiple outcomes | Select one outcome .................................................. 283
Figure 216 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence ............. 283
Figure 217 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence ............. 284
Figure 218 | Multiple outcomes | Setup .................................................................................................. 284
Figure 219 | Multiple outcomes | Use all outcomes, assuming independence ....................................... 285
Figure 220 | Multiple outcomes | Use all outcomes, assuming independence ....................................... 286
Figure 221 | Basic analysis | Multiple outcomes | Use mean of outcomes ............................................. 287
Figure 222 | Basic analysis | Multiple outcomes | Use mean of outcomes ............................................. 287
Figure 223 | Multiple outcomes | Use mean of outcomes ...................................................................... 289
Figure 224 | Multiple outcomes | Use mean of outcomes ...................................................................... 290
Figure 225 | BCG Data in Excel™............................................................................................................... 298
Figure 226 | BCG Data in Excel™............................................................................................................... 299
Figure 227 | Flowchart showing how T2 and I2 are derived from Q ......................................................... 304
Figure 228 | Case-A | Main results | Fixed-effect weights....................................................................... 305
Figure 229 | Case-A | Computing Q.......................................................................................................... 306
Figure 230 | Case-A | Main results | Random-effects ............................................................................. 307
Figure 231 | Case-A | Dispersion of effects about regression line ........................................................... 308
Figure 232 | Case-B | Main results | Fixed-effect weights ....................................................................... 309
Figure 233 | Case-B | Computing Q .......................................................................................................... 310
Figure 234 | Case-B | Main results | Random-effects.............................................................................. 311
Figure 235 | Case-B | Dispersion of effects about regression line ........................................................... 312
Figure 236 | Case-C | Main results | Fixed-effect weights ....................................................................... 313
Figure 237 | Case-C | Computing Q .......................................................................................................... 314
Figure 238 | Case-C | Main results | Random-effects.............................................................................. 315
Figure 239 | Case-C | Dispersion of effects about regression line ........................................................... 316
Figure 240 | Heterogeneity statistics in basic analysis ............................................................................. 318
Figure 241 | Heterogeneity statistics in regression .................................................................................. 319
Figure 242 | Heterogeneity statistics in regression .................................................................................. 320
Figure 243 | Heterogeneity statistics with subgroups.............................................................................. 321
Figure 244 | Heterogeneity statistics with subgroups.............................................................................. 322
Figure 245 | Heterogeneity statistics with subgroups.............................................................................. 323
Figure 246 | Heterogeneity statistics with continuous covariate ............................................................ 324
Figure 247 | Heterogeneity statistics with continuous covariate ............................................................ 325
Figure 248 | Computing τ2 in the presence of subgroups ........................................................................ 327
Figure 249 | Computing τ2 in the presence of subgroups ........................................................................ 328
Figure 250 | Computing τ2 in the presence of subgroups ........................................................................ 328
Figure 251 | Computing τ2 in the presence of subgroups ........................................................................ 329
Figure 252 | Computing τ2 in the presence of subgroups ........................................................................ 330
Figure 253 | Creating variables for interactions ....................................................................................... 331
Figure 254 | Creating variables for interactions ....................................................................................... 332
Figure 255 | Creating variables for interactions ....................................................................................... 332
Figure 256 | Creating variables for interactions ....................................................................................... 333
Figure 257 | Creating variables for interactions ....................................................................................... 333
Figure 258 | Plotting a curvilinear relationship ........................................................................................ 334
Figure 259 | Plotting a curvilinear relationship ........................................................................................ 335
Figure 260 | Plotting a curvilinear relationship ........................................................................................ 336
Figure 261 | Plotting a curvilinear relationship ........................................................................................ 336
Figure 262 | Plotting a curvilinear relationship ........................................................................................ 337
Figure 263 | Plotting interaction of two categorical covariates ............................................................... 339
Figure 264 | Plotting interaction of two categorical covariates ............................................................... 340
Figure 265 | Plotting interaction of two categorical covariates ............................................................... 340
Figure 266 | Plotting interaction of two categorical covariates ............................................................... 342
Figure 267 | Plotting interaction of categorical by continuous covariates .............................................. 343
Figure 268 | Plotting interaction of categorical by continuous covariates .............................................. 344
Figure 269 | Plotting interaction of categorical by continuous covariates .............................................. 346
Figure 270 | Plotting interaction of continuous covariates ...................................................................... 348
Figure 271 | Plotting interaction of continuous covariates ...................................................................... 348
Figure 272 | Plotting interaction of continuous covariates ...................................................................... 351
Figure 273 | CMA | Intercept + Year + Latitude + Allocation | Z | Method of moments ........................ 354
Figure 274 | Metareg| Intercept + Year + Latitude + Allocation | Z | Method of moments ................... 354
Figure 275 | CMA | Allocation | Z | Method of moments ....................................................................... 355
Figure 276 | Metareg | Allocation | Z | Method of moments ................................................................. 355
Figure 277 | CMA | Allocation, Year | Z | Method of moments .............................................................. 356
Figure 278 | Metareg | Allocation, Year | Z | Method of moments ........................................................ 356
Figure 279 | CMA | Intercept, Year-C, Year-C2 | Z | Method of moments .............................................. 357
Figure 280 | Metareg | Intercept, Year-C, Year-C2 | Z | Method of moments........................................ 357
This manual: http://www.meta-analysis.com/pages/cma_manual.php
CMA program: http://www.meta-analysis.com/
BCG data in CMA format
File using period for decimals
File using comma for decimals
BCG data in Excel™ format
Excel™ files for plotting interactions
PART 2: OVERVIEW OF META-REGRESSION
INTRODUCTION
In primary studies, multiple-regression is the statistical technique employed to assess the relationship
between covariates and a dependent variable. In these studies the unit of analysis is the subject, with
covariates and the outcome measured for each subject.
With a few modifications, the same technique can be used in meta-analysis. In this case, the unit of
analysis is the study, with covariates and outcomes measured for each study. We sometimes use the
term “meta-regression” to refer to the use of regression in meta-analysis.
With these modifications in place, the full arsenal of procedures that fall under the heading of
“multiple-regression” in primary studies becomes available to the meta-analyst. For example,

• We can assess the impact of one covariate, or the combined impact of multiple covariates.
• We can enter covariates into the analysis using a pre-defined sequence and assess the impact of
any covariate, over and above the impact of prior covariates.
• We can work with sets of covariates, such as three variables that together define a treatment, or
that represent a nonlinear relationship between the predictor variable and the effect size.
• We can incorporate both categorical (for example, dummy-coded) and continuous variables as
covariates.
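Under the hood, a fixed-effect meta-regression is simply weighted least squares with inverse-variance weights. The short Python sketch below illustrates the computation with invented study-level data (not the BCG data, and not CMA's code); CMA performs the equivalent calculation internally.

```python
import numpy as np

# Hypothetical study-level data: one effect size (e.g., a log risk
# ratio), its within-study variance, and one covariate per study.
effect = np.array([-0.8, -0.4, -0.1, 0.2])
var = np.array([0.04, 0.02, 0.05, 0.03])
latitude = np.array([55.0, 44.0, 33.0, 13.0])

# Design matrix: intercept plus the covariate.
X = np.column_stack([np.ones_like(latitude), latitude])

# Fixed-effect meta-regression = weighted least squares with
# weights equal to 1 / (within-study variance).
W = np.diag(1.0 / var)
XtWX = X.T @ W @ X
beta = np.linalg.solve(XtWX, X.T @ W @ effect)   # [intercept, slope]
se = np.sqrt(np.diag(np.linalg.inv(XtWX)))       # standard errors

print(beta)
print(se)
```

For a random-effects meta-regression, the only change is that an estimate of the between-study variance is added to each study's within-study variance before the weights are formed.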
This book is intended as a resource to explain how to run and interpret a meta-regression. It is also
intended as a manual to show how to use the program CMA to perform a meta-regression.
In Part 1 we provide links to the data files referenced in this manual
In Part 2 we provide an overview of this manual
In Part 3 we introduce the BCG example
In Part 4 we explain that meta-regression is an observational analysis
In Part 5 we provide an overview of fixed-effect vs. random-effects models
In Part 6 we provide a step-by-step guide for running a meta-regression in CMA
In Part 7 we discuss how to interpret the results of a meta-regression
In Part 8 we discuss the computation and meaning of R2
In Part 9 we discuss how to use and customize the regression plot
In Part 10 we discuss computational options
In Part 11 we explain how to work with categorical variables
In Part 12 we discuss when it makes sense to omit the intercept
In Part 13 we discuss how to work with “sets” of covariates
In Part 14 we explain how to work with interactions
In Part 15 we discuss missing data
In Part 16 we show how to run a regression on subsets of the studies
In Part 17 we show how to define and compare different predictive models
In Part 18 we explain how to work with complex data structures
In Part 19 we discuss some caveats about meta-regression
In Part 20 we provide a technical appendix
PART 3: THE BCG EXAMPLE
We will use the “BCG analysis” as the motivating example in this book.
“BCG” refers to the Bacillus Calmette-Guerin (BCG) vaccine, which is intended to prevent tuberculosis
(TB). This vaccine had been studied in a series of 13 controlled trials between the years 1933 and 1968,
with some trials suggesting that the vaccine was effective in reducing the incidence of TB, and others
suggesting that it was not. With the re-emergence of TB in the United States in recent years (including
many drug-resistant cases) the question of whether or not BCG was actually effective took on a new
urgency. Colditz et al. (1994) conducted a meta-analysis to synthesize the data from these trials.
Figure 1 shows a random-effects meta-analysis based on the BCG studies. The effect size is the risk ratio,
as indicated by the labels [A]. In this example a risk ratio of less than 1.0 indicates that the vaccine
reduced the risk of TB, a risk ratio of 1.0 indicates no effect, and a risk ratio higher than 1.0 indicates
that the vaccine increased the risk of TB.
The summary risk ratio [B] is 0.4896 with a 95% confidence interval of 0.3449 to 0.6950, and a p-value of
0.0001. Thus, there is strong evidence that the vaccine is effective in preventing TB.
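These numbers illustrate a general point: summary risk ratios and their confidence intervals are computed on the log scale and then exponentiated, which is why the interval is asymmetric around the point estimate. The Python sketch below back-calculates the reported interval; the standard error (0.179) is inferred from the interval for illustration and is not given in the text.

```python
import math

rr = 0.4896          # summary risk ratio from the text
se_log_rr = 0.179    # assumed: implied by the reported interval

# Confidence limits are symmetric on the log scale...
log_rr = math.log(rr)
lo = math.exp(log_rr - 1.96 * se_log_rr)
hi = math.exp(log_rr + 1.96 * se_log_rr)

# ...and asymmetric after exponentiation: approximately 0.345 to 0.695,
# matching the reported 0.3449 to 0.6950.
print(round(lo, 3), round(hi, 3))
```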
Figure 1 | Basic analysis | Random effects | Risk ratio
Equally important, however, is the variation in the treatment effect, with the risk ratio in individual
studies ranging from 0.2049 (approximately an 80% risk reduction) in one study [C] to 1.5619
(approximately a 56% risk increase) in another [D]. While some of the observed variance in effects is
probably due to sampling error, a substantial amount of the variance reflects real differences in the
treatment effect (more on this later). Obviously, it’s imperative to understand why the vaccine is more
effective in some studies than in others.
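The bracketed percentages follow directly from the risk ratios; as a quick check:

```python
# Converting the extreme risk ratios from Figure 1 into percent change.
# A risk ratio below 1.0 is a risk reduction; above 1.0, a risk increase.
rr_smallest, rr_largest = 0.2049, 1.5619

reduction_pct = (1 - rr_smallest) * 100   # approximately an 80% risk reduction
increase_pct = (rr_largest - 1) * 100     # approximately a 56% risk increase
```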
Among the studies in the meta-analysis, there appears to be a relationship between climate and
effectiveness, such that studies performed in colder locations tended to show a stronger effect. If this
relationship is real, it could be explained by either of two mechanisms. First, persons in colder climates
may be less likely to have a natural immunity to TB. It follows that the population in these climates
would be more susceptible to TB, and more likely to benefit from the vaccine. Second, it’s likely that the
drug would be more potent in the colder climates. This follows from the fact that in warmer climates
the heat could cause the drug to lose potency.
Optimally, a researcher would be able to code each study for the prevalence of natural immunity and for
drug potency, and then use these as predictors of effectiveness in a meta-regression. However, these
predictors were not available for the analysis. Therefore, the analysts elected to use “Latitude”
(actually, the absolute value of latitude) as a surrogate for these covariates, the assumption being that
studies more distant from the equator tended to employ populations with less natural immunity and
more potent vaccine.
This was the strategy adopted by Berkey et al. (1995), who employed meta-regression to assess the
relationship between latitude and treatment effect. Given the post hoc nature of this analysis, a
positive finding would not be definitive, but would suggest a direction for additional research.
This regression has been used as an example in many texts on meta-analysis, including Borenstein et al. (2009), Egger et al. (2001), Sutton et al. (2000), and Hartung et al. (2008). We will use this example here as well,
to allow readers to compare our analysis with the analyses presented in the other texts. Note that each
text presents the data in a somewhat different format, and here we follow the format employed by
Hartung. In addition to the original variables, we created new variables for the purposes of this text.
For example, we classified studies as “Hot” or “Cold” based on the latitude, created variables that are
centered versions of the originals, and created variables to represent interactions among the original
variables.
Part of the dataset is shown in Figure 2. The full set of variables is described in Table 1. In [Appendix 1:
The dataset] we show how to create all of the new variables in Excel™.
Figure 2 | Data-entry screen
Table 1

Original variables
Latitude (Integer): This is absolute latitude, which is simply the latitude ignoring the sign. Low values are closer to the equator; high values are more distant.
Allocation (Categorical): This is the method employed to assign people to the treated or control conditions. The three possible classifications are randomized, alternate, and systematic.
Year (Integer): This is the year the study was conducted.

Variables related to climate
Latitude-C (Decimal): Latitude, centered to have a mean of zero.
Latitude-C2 (Decimal): The square of Latitude-C.
Climate (Categorical): We dichotomized latitude, and classified each study’s location as Hot (latitude under 34) or Cold (latitude over 41).
Hot (Integer): This is a numeric version of Climate, coded 1 for Hot and 0 for Cold.
Cold (Integer): This is a numeric version of Climate, coded 1 for Cold and 0 for Hot.

Variables related to study year
Year-C (Decimal): Year, centered to have a mean of zero.
Time (Categorical): We dichotomized Year, and classified each study’s time as Early (pre-1945) or Recent (post-1945).
Early (Integer): This is a numeric version of Time, coded 1 for Early and 0 for Recent.
Recent (Integer): This is a numeric version of Time, coded 1 for Recent and 0 for Early.

Interactions
Hot x Year-C (Decimal): Interaction of Hot and Year-C (dichotomous and continuous).
Hot x Recent (Integer): Interaction of Hot and Recent (dichotomous and dichotomous).
Latitude-C x Year-C (Decimal): Interaction of Latitude-C and Year-C (continuous and continuous).

The letter “C” added to a variable indicates that the variable has been centered (see appendix).

Types of variables
Categorical: Studies belong to discrete groups, such as “Cold” and “Hot”. This cannot be used directly in the analysis, but is used to create numeric variables (dummy variables).
Integer: Studies have values on an integer scale, and can take on any whole number.
Decimal: Studies have values on a continuous scale, and can take on any whole or decimal number.
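The appendix shows how to create the new variables in Excel™; the same derivations can be sketched in Python. The latitude and year values below are hypothetical stand-ins, not the actual BCG data, and the single Hot/Cold cutoff is an illustrative simplification (the text leaves a gap between latitudes 34 and 41):

```python
# Deriving the Table 1 variables from raw latitude and year.
# All input values here are hypothetical, for illustration only.
latitudes = [44, 55, 42, 52, 13, 44, 27, 42, 18, 33, 33, 18, 27]
years = [1948, 1949, 1960, 1977, 1973, 1953, 1973, 1946, 1980, 1968, 1961, 1974, 1969]

mean_lat = sum(latitudes) / len(latitudes)
mean_year = sum(years) / len(years)

lat_c = [x - mean_lat for x in latitudes]       # Latitude-C: centered latitude
lat_c2 = [x ** 2 for x in lat_c]                # Latitude-C2: square of Latitude-C
hot = [1 if x < 34 else 0 for x in latitudes]   # Hot dummy (cutoff assumed at 34)
cold = [1 - h for h in hot]                     # Cold dummy (complement of Hot)
year_c = [y - mean_year for y in years]         # Year-C: centered year
recent = [1 if y > 1945 else 0 for y in years]  # Recent dummy (post-1945)
early = [1 - r for r in recent]                 # Early dummy (pre-1945)

# Interactions are simple products of the component variables.
hot_x_year_c = [h * y for h, y in zip(hot, year_c)]      # Hot x Year-C
hot_x_recent = [h * r for h, r in zip(hot, recent)]      # Hot x Recent
lat_c_x_year_c = [l * y for l, y in zip(lat_c, year_c)]  # Latitude-C x Year-C
```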
In Figure 1 the treatment effect (or effect size) was displayed as a risk ratio. While the risk ratio has the
advantage of being an intuitive index, the analyses are actually performed using the log of the risk ratio,
and then converted to risk ratios for display. Since one of our goals in this book is to explain the
mechanics of the analyses, we will generally be working with the log units.
For example, where Figure 1 showed the risk ratios, Figure 3 shows the same forest plot using log units,
as indicated by the labels [E]. In this case, a log risk ratio of less than 0.0 indicates that the vaccine
reduced the risk of TB, a log risk ratio of 0.0 indicates no effect, and a log risk ratio greater than 0.0
indicates that the vaccine increased the risk of TB. The summary effect [F] is a log risk ratio of −0.7141.
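The two scales are linked by the natural log, so results can be moved back and forth freely; a small sketch using the summary values quoted above:

```python
import math

# Moving the random-effects summary between the two scales.
log_rr = -0.7141                       # summary effect in log units (Figure 3)
rr = math.exp(log_rr)                  # back to the risk-ratio scale (Figure 1), ~0.4896

# The confidence interval converts the same way, limit by limit.
ci_log = (math.log(0.3449), math.log(0.6950))
ci_rr = (math.exp(ci_log[0]), math.exp(ci_log[1]))

# The "no effect" point maps from 0.0 (log scale) to 1.0 (ratio scale).
no_effect = math.exp(0.0)
```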
Figure 3 | Basic analysis | Random effects | Log risk ratio
PART 4: META-REGRESSION IS OBSERVATIONAL
When we work with primary studies we need to be aware of the difference between a randomized
study and an observational study.
In a randomized trial, participants are assigned at random to a condition, such as treatment versus
placebo. The randomization is intended to ensure that the subjects in the two conditions are similar in
all respects except for the treatment. Therefore, assuming that the randomization works properly, any
differences that emerge between groups can be attributed to the treatment.
By contrast, in an observational study we compare pre-existing groups, such as workers with a college
education versus those who did not attend college. While we can report on differences between groups,
we cannot attribute these differences to the presence or absence of a college education because the
groups differ in other ways as well. For example, it is possible (indeed likely) that subjects who had a
college education also had other advantages, including better skills in an array of areas.
Consider how this plays out in a meta-analysis.
Assume we start with a set of randomized experiments that assess the impact of an intervention. If the
effect in each study can serve to establish causality because of the randomization process, then the
summary effect can also serve to establish causality. If the effect in each study is due to the
intervention, then the overall effect is due to the intervention.
However, even if the individual studies are randomized trials, once we move beyond the goal of
reporting a summary effect and proceed to perform a subgroup analysis or meta-regression, we have
moved out of the domain of randomized experiments, and into the domain of observational studies.
In this example, if the effect size is different in hot climates than in cold climates we cannot assume that
this is because of climate. While we choose to label one group of studies “Hot climate” and another
“Cold climate”, it is possible that the two groups differ from each other in other ways as well, and that it
might be these other factors (instead of climate, or in addition to climate) that are responsible for the
difference in effects.
For example, it turns out that the “Hot” studies, where the vaccine was less effective, tend to be more
recent. We’d like to think that the vaccine was less effective because of storage conditions in the hot
climates, and that if we used better storage the efficacy would improve. However, it’s also possible that
there were unrelated changes over the decades that caused the vaccine to become less effective.
That said, in primary observational studies, researchers sometimes use regression analysis to try and
remove the impact of potential confounders. This is not a perfect solution since there may be other
confounders of which we are not aware, but this approach can help to isolate the impact of specific
factors and generate hypotheses to be tested in randomized trials. The same holds true for meta-regression.
This approach is potentially useful only when there are enough studies to isolate the unique impact of
each factor. Many meta-regressions are based on relatively small numbers of studies, and so it may not
be possible to adjust for potential confounds.
Of course, even when there are enough studies to adjust for known confounds, we cannot be certain
that we’ve identified all possible confounds. Therefore we can’t use this approach to prove a causal
relationship.
There is one exception to the rule that subgroup analysis and regression cannot prove causality. This is
the case where not only was assignment to treatment condition randomized, but also assignment to
subgroup was randomized. In this case we know that the only systematic difference between subgroups
is the one captured by subgroup membership.
The pharmaceutical example discussed later is a case in point. In this hypothetical example we enrolled
1000 patients and assigned some to studies that would test a low dose of the drug vs. placebo, and
others to studies that would test a high dose of the drug vs. placebo. Here, the assignment to subgroups
is random. The same would apply if the patients were assigned to ten studies where the dose of drug
was varied on a continuous scale, and we used meta-regression to test the relationship between dose
and effect size.
This set of circumstances (the drug company example) is extremely rare in practice. We present it here
primarily to illustrate the conditions that would be needed before we could draw a causal inference
from a subgroup analysis or regression. Our main point is that in the absence of these conditions we
cannot draw a causal inference.
PART 5: FIXED-EFFECT VS. RANDOM-EFFECTS
Statistical models
Before turning to regression, we briefly review the statistical models typically employed in meta-analysis
– the fixed-effect model and the random-effects model.
To understand the difference between these two models it might help to distinguish between the ideas
of a population versus a universe.
One study samples subjects from a population, which is defined as people meeting a specific set of
criteria. A second study samples subjects from its own population, defined by its own set of criteria.
If the two sets of criteria are the same in all relevant respects, then we can say
that both studies are sampling from the same population. Equivalently, if the true effect size (the effect
size that we would see if there was no sampling error) is identical in both populations, then we can think
of them as the same population.
However, if the criteria differ in any material respect, then we would say that the two studies sample
from different populations. In this case each study has its own criteria that define the study’s population,
and we need a second set of criteria that tell us what kinds of studies (what populations) we want to
include in the meta-analysis. This second set of criteria defines the universe of populations.
For example, suppose that each study samples persons from the lung-cancer outpatient clinic at the
hospital where the study is being conducted.
• If there are ten studies in the analysis and all were conducted at the same hospital (let’s also assume at the same time), then all studies are being sampled from the same population. The true effect size in all the studies is the same.
• If there are ten studies in the analysis and each was conducted at a different hospital, then each study is sampled from a different population. The true effect size probably differs (possibly by a little, or possibly by a lot) from one hospital to the next.
In the second case (ten hospitals) if we decide to include all these hospitals in the meta-analysis, it’s
because all the populations are from the same universe. We would probably define the universe as
clinics where the patients are similar enough so that the studies are all addressing the same
fundamental question.
To this point we’ve focused on the patients in discussing the difference between a population and a
universe, but the distinction between the two depends also on other aspects of the study. For example,
studies may be from the same population if they all run for exactly two weeks. If some studies run for
two weeks and others for three weeks, then the two kinds of studies are based on different populations
but are drawn from the same universe. Similarly, studies may be from the same population if they all
use the identical measure of outcome. If some studies use one measure while others use a similar
measure, then they are based on different populations but are drawn from the same universe.
It should be clear that we are defining a population very narrowly. Under this definition studies are only
drawn from the same population if they are essentially replicates of each other in all material respects.
This means that not only the subjects but also the methods, the specifics of the intervention, and the
outcome measures are, for all intents and purposes, identical across studies. This criterion will rarely be
met in practice, and will never be met when we’re working with studies conducted independently of
each other.
With this as background we can discuss the difference between the fixed-effect and the random-effects
models.
• The fixed-effect model applies if all studies are drawn from a single population (the identical subjects and methods). The studies share a common effect size, and so the effect size is fixed, or constant.
• The random-effects model applies if the studies are drawn from a universe of populations. The true effect size varies from one population to the next, and the studies are sampled at random from this universe.
The selection of a model is critically important for several reasons.
First, it sets up a framework for the analysis, establishing what questions we can ask and how to
interpret the results.
• Under the fixed-effect model we assume that all studies share a common true effect size, and our goal is to estimate this common parameter.
• Under the random-effects model we allow that the true effect size might vary from study to study, and our goal is to estimate the mean of these parameters.
Second, the selection of a model affects how weights are assigned to the studies. This affects both the
summary estimates themselves and also the precision of the summary estimates.
• Under the fixed-effect model there is only one level of sampling (the subjects in each study are sampled from all subjects in the study’s population) and therefore only one source of sampling error (the observed effect in each study differs from the true effect for that study’s population). The error variance for each study is quantified as V, and the weight assigned to each study is the inverse of this variance, or 1/V.
• Under the random-effects model there are two levels of sampling (the subjects in each study are sampled from all subjects in the study’s population, and the study populations are sampled from the universe of study populations) and therefore two sources of sampling error (the observed effect in each study differs from the true effect for that study’s population, and the mean true effect for the sampled studies differs from the mean for the universe of studies). The first error variance is quantified as V and the second as T². The total error variance for each study is then V + T², and the weight assigned to each study is the inverse of this variance, or 1/(V + T²).
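In code, the only difference between the two weighting schemes is the variance term in the denominator. A minimal sketch with hypothetical effects and variances (and T² treated as known, although in practice it must be estimated):

```python
# Inverse-variance weighting under the two models (hypothetical data).
# y[i] = observed log risk ratio; v[i] = within-study variance V.
y = [-0.9, -1.6, -1.3, -0.2, 0.4]
v = [0.33, 0.44, 0.02, 0.05, 0.07]
t2 = 0.30  # between-study variance T^2 (assumed known for this sketch)

# Fixed-effect: weight = 1/V
w_fe = [1 / vi for vi in v]
m_fe = sum(wi * yi for wi, yi in zip(w_fe, y)) / sum(w_fe)

# Random-effects: weight = 1/(V + T^2)
w_re = [1 / (vi + t2) for vi in v]
m_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

# Random-effects weights are more moderate: adding T^2 to every
# denominator shrinks the ratio of the largest to the smallest weight.
spread_fe = max(w_fe) / min(w_fe)
spread_re = max(w_re) / min(w_re)
```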
We typically encounter the idea of fixed-effect vs. random-effects models in the context of a simple
analysis, where we have a single set of studies. In this case, the distinction between the two models is
relatively straightforward: the fixed-effect model applies if the studies share a common effect size, and
the random-effects model applies otherwise.
However, the same idea can be extended to the case where we have discrete subgroups of studies, and
even further, to the case where we have studies that range along a continuum on some dimension(s).
Our goal here is to explain these extensions, and for that purpose we will show how the difference
between fixed-effect and random-effects models plays out in (A) a simple analysis, (B) a subgroups
analysis, and (C) a meta-regression.
Case A - A simple analysis
Consider the case where we are working with a single set of studies. The fixed-effect model applies if all
the studies share a common effect size. The random-effects model applies if the effect size might vary
from study to study, across all studies in the analysis.
Case B - Subgroups
Consider the case where we want to compare the effect size for two or more subgroups of studies (for
example, studies in cold climates vs. studies in hot climates). The fixed-effect model applies if all studies
within a subgroup share the same effect size parameter. The random-effects model applies if the effect
size parameters might vary from study to study, for studies within a subgroup.
Case C – Regression
Consider the case where we want to look at effect size in relation to a continuous covariate (for
example, latitude). The fixed-effect model applies if all studies at the same latitude share the same
effect size parameter. The random-effects model applies if the effect size parameters vary from study
to study, for studies at the same latitude.
While we have presented these three cases as being distinct from each other, the fact is that they can all
be subsumed under the same general principle. Specifically,
• We use the fixed-effect model when all studies with the same predicted value have the same true effect size.
• We use the random-effects model when studies with the same predicted value have different true effect sizes.
Thus,
• In Case A the frame of reference is all studies, and the predicted value for any study is the mean of all studies.
• In Case B the frame of reference is all studies within the subgroup, and the predicted value for any study is the corresponding subgroup mean.
• In Case C the frame of reference is all studies at the same latitude, and the predicted value for any study is the predicted value given by the regression equation.
If the predicted value for all studies within the frame of reference is the same, then the only source of
variance is within-study variance (V), and so the fixed-effect model applies. Otherwise we need to
account also for between-study variance (T²), and so the random-effects model applies. Here, we present
three examples to show how this idea applies to the three cases outlined above.
Case A - A simple analysis
Case A1. A pharmaceutical company draws a sample of 1000 patients for a randomized controlled trial,
but cannot work with all of the patients at one time because of limited space. Therefore, the patients
are randomly assigned to one of ten cohorts, and each cohort starts treatment in a different week. If we
assume no training effect and no seasonal effect, and ensure that all procedures are identical from one
cohort to the next, it follows that the treatment effect should be the same for all cohorts. If we treat
each cohort as a separate study and use meta-analysis to synthesize the results, then the fixed-effect
model would apply.
Case A2. Ten universities form a consortium to run studies with the identical protocol (one at each
university) and then use meta-analysis to synthesize the results. This is similar to Case A1, but the
assumption that the effect size will be the same in all studies is more tenuous. It might be possible to
ensure that the intervention is identical at all the universities, but it might not. It might be possible to
ensure that the subjects are identical in all relevant respects at all the universities, but it might not.
Therefore, it’s possible that the fixed-effect model would apply, but it might not. In this case it would
probably be a good idea to use the random-effects model. If it turns out empirically that the effect sizes
do differ, then the random-effects model will have been the correct choice. If it turns out empirically
that the effect sizes do not differ, the random-effects weights will be identical to the fixed-effect
weights, and so there is no price to pay for having chosen the random-effects model.
Case A3. The vast majority of simple meta-analyses are similar neither to Case A1 nor to Case A2. Rather,
they involve studies planned and performed by different teams of researchers without prior
coordination. For example, we may locate 10 studies in the literature that seem to address the same
fundamental question. We may decide that the studies are similar enough that it makes sense to
perform a synthesis (for example, they all tested the same intervention) but there is no reason to
assume that the true effect size will be identical in all studies. Rather, it is likely that the impact of the
intervention will be affected (at least a little) by details of the sample (age, history), of the intervention
itself (dose, duration), and/or the outcome measure (one test or another). If the studies are
fundamentally the same (in the sense that they address the same basic question) then meta-analysis
may enable us to identify the core impact of the intervention, cutting through the noise created by
these differences. However, to identify this impact properly we need to take account of this noise by
using random-effects weights. In this case, the random-effects model is a more plausible fit for the
data.
Case B - Subgroups
The same idea is easily extended to the case of an analysis where we want to compute the summary
effect size for two subgroups of studies, and then compare the two.
Case B1. In Case A1 (above), the pharmaceutical company randomly assigned patients to one of ten
identical cohorts to test Drug-A vs. Control. Suppose that the next year the company did exactly the
same thing with a new sample of patients, to test Drug-B vs. Control. The Drug-A studies are Subgroup-A and the Drug-B studies are Subgroup-B. There is no expectation that the effect size will be the same in
the two subgroups. Indeed, we may expect that the effect size will be greater in one subgroup than the
other. However, we do expect all studies in the first group to have the same effect size as each other,
and all studies within the second group to have the same effect size as each other. The fixed-effect
model applies here.
Case B2. In Case A2 (above), the consortium planned to run ten identical studies (one at each
university). They hoped that studies would be identical in all ways that could impact on the effect size,
but could not ensure that this would be the case. Suppose that the next year they continued the
arrangement, and ran ten more studies, this time using a variant of the intervention. The same logic
applies as in Case A2, the only difference being that the logic now applies within subgroups. While it’s
possible that all studies within a subgroup share the same effect size, it’s also possible that the effect
size varies. Therefore, it’s probably safer to apply the random-effects model to compute the summary
effect size within subgroups.
Case B3. In Case A3 (above), researchers located 10 published studies that were similar, but not
identical. Logic dictates that the effect size will vary from study to study, and therefore the random-effects model is appropriate. The same example can be extended to subgroups. Suppose researchers
located 10 studies that assessed the impact of Drug-A vs. Control (Subgroup-A), and another 10 that
assessed the impact of Drug-B vs. Control (Subgroup-B). We may expect the effect sizes within
either subgroup to be similar to each other, but we have no reason to expect that they will be identical
to each other. Therefore, the random-effects model is a more plausible fit for the data within
subgroups.
Case C – Regression
Finally, the same idea can be extended to the case where we use a continuous covariate (or a set of
covariates) to predict effect size.
Case C1. In Case A1 (above), the pharmaceutical company randomly assigned patients to one of ten
identical cohorts to test Drug-A vs. Control. We can extend Case A1 to a situation where the company
runs ten identical cohorts, and assume that it repeats this process five times, each time with a different
dose of the drug. The meta-analysis looks at the relationship of dose with effect size. For all studies at
the same dose, the effect size should be the same. The fixed-effect model makes sense here.
Case C2. In Case A2 (above), the consortium planned to run ten identical studies (one at each
university). We can extend Case A2 to a situation where the consortium runs ten studies based on the
same protocol, and assume that it repeats this process five times, each time with a longer intervention.
The meta-analysis looks at the relationship of the intervention’s duration with effect size. For all studies
of the same duration, the effect size might be the same, but might not. The random-effects model is
probably the better choice.
Case C3. In Case A3 (above), when studies are performed using different protocols, logic dictates that
the effect size will vary from study to study. We can extend Case A3 to a situation where we locate all
studies that assessed the impact of an intervention, and then code them based on the dosage. While
studies with a similar dose may tend to have similar effect sizes, logic dictates that the effect size for any
given dose will still vary. The random-effects model is a more plausible fit for the data.
How the model affects the analysis
The selection of a statistical model must be based on the sampling frame (as outlined above) and not on
the fact that one model will yield a more desirable estimate of the effect size or its precision (as outlined
below). That said, it’s helpful to understand how the selection of one model or the other affects the
estimates of effect size and precision.
Again, it’s easiest to explain this for the simple case, and then extend the example to the case of
subgroups and regression.
Recall that V represents the within-study variance (the variance of the observed effects about the true
effect for that study’s population), while T² represents the between-study variance (the variance of true
effects about the mean true effect for all studies in the universe of populations).
The basic idea is that the uncertainty (and therefore the weights) under the fixed-effect model is based
on V, whereas the uncertainty under the random-effects model is based on V + T². The only thing that
changes as we move from Case A to Case B to Case C is the frame of reference for estimating T², as follows.
• When we are working with a single population, T² reflects the dispersion of true effects across all studies, and is therefore computed for the full set of studies.
• When we are working with subgroups, T² reflects the dispersion of true effects within a subgroup, and is therefore computed within subgroups.
• When we are working with regression, T² reflects the dispersion of true effects for studies with the same predicted value (that is, the same value on the covariates), and is therefore computed for each point on the prediction slope. As a practical matter, of course, most points on the slope have only a single study, and so this computation is less transparent than that for the single population (or subgroups), but the concept is the same.
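As a concrete illustration, here is a sketch of the DerSimonian-Laird (method of moments) estimator of T², one common estimator among several, applied within each subgroup as in Case B. The subgroup data are hypothetical:

```python
def dl_tau2(y, v):
    """DerSimonian-Laird estimate of T^2 from effects y and variances v."""
    w = [1 / vi for vi in v]
    m = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)   # fixed-effect mean
    q = sum(wi * (yi - m) ** 2 for wi, yi in zip(w, y)) # Q statistic
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)      # scaling constant
    df = len(y) - 1
    return max(0.0, (q - df) / c)  # truncate at zero, as is conventional

# Case A uses one frame of reference, so T^2 is computed over all studies.
# Case B computes T^2 separately within each subgroup, e.g. (hypothetical):
hot_y, hot_v = [-0.3, -0.2, 0.1], [0.04, 0.05, 0.06]
cold_y, cold_v = [-1.4, -1.0, -1.6], [0.05, 0.30, 0.40]
tau2_hot = dl_tau2(hot_y, hot_v)
tau2_cold = dl_tau2(cold_y, cold_v)
```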
The practical implications of using a random-effects model rather than a fixed-effect model are the
same for all three cases (simple analysis, subgroups, and regression). To wit,
• The random-effects model will lead to more moderate weights being assigned to each study. As compared with a fixed-effect model, the random-effects model will assign more weight to small studies and less weight to large studies.
• Under the random-effects model, the confidence interval about each coefficient (and slope) will be wider than it would be under the fixed-effect model.
• Under the random-effects model, the p-values corresponding to each coefficient and to the model as a whole are less likely (on average) to meet the criterion for statistical significance.
These points are evident in Figure 4 and Figure 5, which show the regression of log risk ratio on latitude
using fixed-effect and random-effects weights, respectively.
Figure 4 | Regression of log risk ratio on latitude | Fixed-effect
Figure 5 | Regression of log risk ratio on latitude | Random-effects
Relative weights
Under the fixed-effect model (Figure 4) the study weights tend to be more extreme, with large studies
getting substantially more weight than small studies. Under the random-effects model (Figure 5), the
study weights tend to be more moderate, with relatively small differences between the studies.
The selection of a model will have a substantial impact on the regression line if studies that fall outside
the pattern happen to be especially large. These studies will tend to pull the regression line towards
themselves, and this pull will be stronger under fixed-effect weights than under random-effects weights.
(Conversely, a small study will have more impact on the regression line when we apply random-effects
weights.) In the BCG example the larger studies tend to fall within the pattern of the others, and
so the regression lines are similar in the two plots.
Absolute weights
Under the fixed-effect model there is only one source of sampling variance (within-study). Therefore the
weights (which are the inverse of the variance) are relatively large, which yields a relatively narrow
confidence interval (Figure 4). Under the random-effects model there is an additional source of
sampling variance (between-study). Therefore the weights are smaller, which yields a wider confidence
interval (Figure 5).
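A standard result in inverse-variance meta-analysis is that the variance of the summary effect is the reciprocal of the sum of the weights, so smaller random-effects weights translate directly into a wider interval. A sketch with hypothetical variances:

```python
import math

v = [0.05, 0.10, 0.02, 0.08]   # hypothetical within-study variances V
t2 = 0.20                      # hypothetical between-study variance T^2

# Variance of the summary effect = 1 / (sum of weights), so the
# standard error is the square root of that reciprocal.
se_fe = math.sqrt(1 / sum(1 / vi for vi in v))         # fixed-effect
se_re = math.sqrt(1 / sum(1 / (vi + t2) for vi in v))  # random-effects

# The 95% confidence interval width scales directly with the SE,
# so the random-effects interval is wider.
width_fe = 2 * 1.96 * se_fe
width_re = 2 * 1.96 * se_re
```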
PUTTING REGRESSION IN CONTEXT
In primary studies, we sometimes think of analyses as belonging to one of three types.
• Case A. Simple analysis, where the goal is to estimate the mean.
• Case B. Analysis of variance, where the goal is to estimate the mean effect in two (or more) subgroups of subjects, and then see if (and how) the mean varies by subgroup.
• Case C. Regression, where the goal is to estimate the mean effect for subjects that share the same values on one (or more) covariates, and then see if (and how) the mean varies as a function of the covariate values.
In fact, though, regression is a general system that includes all three of these cases. In other words, we
can use regression not only in Case C but also in Cases A and B. If we did so, we would get the identical
answers using regression that we get using a simple analysis or analysis of variance.
The same holds true for meta-analysis. While we tend to employ meta-regression only for Case C, we
also have the option to use it for Case A or B. We will take advantage of this fact to help explain
meta-regression. Specifically,
• For Case A we will perform an analysis using the traditional approach and then using regression, to show the correspondence between the two.
• For Case B we will perform a subgroups analysis using the traditional approach and then a regression, to show the correspondence between the two.
• For Case C, there is no simpler approach, and so we will move directly to the regression.
The goals of a fixed-effect analysis are different than the goals of a random-effects analysis, and for that
reason we will address each separately. We’ll run through this sequence of cases (A, B, C) using the
fixed-effect model, and then for the random-effects model.
In the text that follows we focus on conceptual issues. The statistical formulas that underlie these issues
are presented in Appendix 2: Understanding Q and Appendix 3: Tests of heterogeneity.
FIXED-EFFECT MODEL
Basic analysis (Case A)
Here, we present a meta-analysis of the BCG studies. This is a basic meta-analysis in the sense that our
goal is simply to estimate the mean effect size for the full set of studies. We will perform this exercise
(1) using the traditional approach and then (2) using meta-regression, to show the correspondence
between the two.
Figure 6 and Figure 7 are screen-shots from CMA using the traditional approach to meta-analysis. Note
that the effect size is in log units [A]. Each row shows the effect size and confidence interval for one
study.
The lines marked “Fixed” in Figure 6 [B] and in Figure 7 [C] show the summary (common) effect size as
−0.4303, with a 95% confidence interval of −0.5097 to −0.3509. The Z-value for a test of the null is
−10.6247 with a corresponding p-value of < 0.0001.
Figure 6 | Basic analysis | Fixed-effect | Log risk ratio
Test of effect size
Is the effect size zero?
The line labeled “Fixed” in Figure 6 [B] and Figure 7 [C] show the effect size is −0.4303 with a standard
error of 0.0405. The Z-value for a test of the null is −10.6247 with a corresponding p-value of < 0.0001.
We would conclude that the common effect size is probably not zero.
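These numbers can be reproduced from the effect size and standard error alone; a quick check in Python:

```python
import math

# Test of the common effect as reported in the figures:
# effect = -0.4303 (log risk ratio), SE = 0.0405.
effect, se = -0.4303, 0.0405

z = effect / se                                   # Z-test of the null
ci_lo, ci_hi = effect - 1.96 * se, effect + 1.96 * se
# Two-sided p-value from the standard normal distribution.
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

assert round(z, 4) == -10.6247
assert round(ci_lo, 4) == -0.5097 and round(ci_hi, 4) == -0.3509
assert p < 0.0001
```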
Figure 7 | Basic analysis | Fixed-effect | Log risk ratio
Test of the statistical model
Is the data consistent with the fixed-effect model?
In Figure 7, the section labeled Heterogeneity [D] presents statistics that address the heterogeneity in
effect size. The Q-value is 152.2330 with 12 degrees of freedom and a corresponding p-value under
0.0001. This tells us that the true effect size probably varies across studies, which means that the data
are not consistent with the assumptions of the fixed-effect model.
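The Q statistic behind this test is just a weighted sum of squared deviations from the weighted mean. A minimal sketch, using hypothetical effects and variances (not the BCG studies):

```python
# Hypothetical effects and within-study variances (illustration only).
y = [-0.9, -0.5, -0.2, 0.1]
v = [0.05, 0.08, 0.04, 0.10]

w = [1 / vi for vi in v]                                  # fixed-effect weights
mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)      # weighted mean
q = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y))    # weighted SS about mean
df = len(y) - 1

# Under the fixed-effect model Q follows a chi-square distribution with
# k - 1 degrees of freedom; a Q far above df signals heterogeneity.
assert df == 3
assert abs(q - 8.6019) < 1e-3
```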
The regression approach
We can perform the same analysis using meta-regression. Figure 8 shows the screen in CMA where we
define the model. Since our goal here is to estimate the common effect size (that is, the intercept) we
have included no covariates except for the intercept [E]. The results are shown in Figure 9.
Figure 8 | Regression setup | Intercept only
Figure 9 | Regression | Main results | Fixed-effect | Intercept only
Test of effect size
Is the effect size zero?
Since there are no covariates, the predicted effect is simply the intercept, and so the question “Is the
effect size zero?” is addressed by a test of the intercept.
In Figure 9, the regression equation [F] gives the predicted effect size for all studies as Y=−0.4303 with a
standard error of 0.0405, variance of 0.0016, and confidence interval of −0.5097 to −0.3509. The Z-value for a test of the null is −10.6247 with a corresponding p-value of < 0.0001. We would conclude
that the effect size is probably not zero. Note that these numbers match the numbers from the
traditional analysis in Figure 6 [B] and Figure 7 [C].
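That the intercept-only regression must reproduce the weighted mean is easy to verify numerically: the intercept that minimizes the weighted sum of squares is the inverse-variance weighted mean. A sketch with hypothetical data:

```python
# Hypothetical effects and within-study variances (not the BCG data).
y = [-0.8, -0.3, -0.1]
v = [0.05, 0.10, 0.20]
w = [1 / vi for vi in v]

def wss(b0):
    """Weighted sum of squares about a candidate intercept b0."""
    return sum(wi * (yi - b0) ** 2 for wi, yi in zip(w, y))

# Brute-force minimization over a fine grid: the minimizing intercept
# coincides with the inverse-variance weighted mean.
grid = [i / 10000 - 1 for i in range(20001)]   # -1.00 ... +1.00
b0_hat = min(grid, key=wss)
weighted_mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

assert abs(b0_hat - weighted_mean) < 1e-3
```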
Analysis of variance
In Figure 9, results for the fixed-effect regression are presented using an analysis of variance framework,
where the total [I] weighted sum of squares (WSS, or Q) is partitioned into the part explained by the
predictive model [G] and the residual [H].
Model
The line labeled “Model” [G] displays the WSS explained by the predictive model. Since there are no
covariates in this example, this line has no relevance here. The Q-value is displayed as 0.0, the df as 0, and the p-value as 1.0.
Residual
The line labeled “Residual” [H] displays the WSS not explained by the model and tests the hypothesis
that all studies share a common (true) effect size. Since the Q value is 152.2330 with df = 12 and p <
0.0001 we conclude that the true effect size probably varies across studies. Thus, the data are not
consistent with the assumptions of the fixed-effect model. Note that the numbers are identical to those
in Figure 7 [D].
Total
The line labeled “Total” [I] displays the total WSS for the full set of studies (with no predictors). In this
case the total WSS is the same as the residual WSS, since both are based on the variance across all 13
studies. The Q value is 152.2330 with df = 12 and p < 0.0001. Again, the numbers are identical to those
in Figure 7 [D].
Summary
Our goal here was to show the correspondence between a traditional analysis and a regression for a
simple analysis.
• In a meta-analysis with no covariates we want to estimate (and test) the effect size. This question is addressed by the common effect (−0.4303) in the traditional analysis, and by the intercept (−0.4303) in the regression. In both cases the standard error is 0.0405 and the p-value is < 0.0001, which tells us that the true effect size is probably not zero.
• We also want to know if the data are consistent with the fixed-effect model. This question is addressed by the Q-test for heterogeneity in the traditional analysis, and by the Q-test for the residual in the regression. In both cases the Q-value is 152.2330 with df = 12 and p < 0.0001. This tells us that the true effect size probably varies across studies, and so the assumptions of the fixed-effect model have been violated.
Subgroups analysis (Case B)
Above, we found that the impact of the vaccine varied from study to study. The researchers
hypothesized that this variation might be explained by the fact that studies were conducted in various
locations, and that the vaccine was more effective in colder climates. To test this hypothesis we can
classify each study as being either “Cold” or “Hot” based on its latitude, and then perform an analysis (a)
to estimate the effect size in each subgroup of studies, and (b) to compare the effect size for the two
subgroups. (This differs from the original analysis, where the researchers used latitude as a continuous
covariate rather than creating two groups).
In Figure 10 the studies have been divided into subgroups as follows.
• The six “Cold” studies are at the top, followed by their common effect, a log risk ratio of −0.9986 [A].
• The seven “Hot” studies are at the bottom, followed by their common effect, a log risk ratio of −0.1115 [B].
Figure 10 | Subgroups Cold vs. Hot | Fixed-effect
The same statistics are also shown in the “Fixed effect” section of Figure 11. The lines labeled [A] and
[B] in Figure 11 correspond to the lines labeled [A] and [B] in Figure 10.
Is the common effect size zero for each subgroup?
For the cold studies this is addressed by Figure 10 [A]. The same numbers are displayed in Figure 11 [A].
The effect size is −0.9986 with a standard error of 0.0676, variance of 0.0046, and confidence interval of
−1.1310 to −0.8662. The Z-value for a test of the null is −14.7808 with a corresponding p-value of <
0.0001. We would conclude that the common effect size for studies in cold climates is probably not
zero.
For the hot studies this is addressed by Figure 10 [B]. The same numbers are displayed in Figure 11 [B].
The effect size is −0.1115 with a standard error of 0.0506, variance of 0.0026, and confidence interval of
−0.2107 to −0.0124. The Z-value for a test of the null is −2.2042 with a corresponding p-value of 0.0275.
We would conclude that the common effect size for studies in hot climates is probably not zero.
Figure 11 | Subgroups Cold vs. Hot | Fixed-effect
Analysis of variance
In Figure 11, the section labeled Heterogeneity [C] shows how the total variance is partitioned into its
component parts. Where analysis of variance in a primary study is based on sums of squares (SS), the
subgroups analysis in a meta-analysis is based on weighted sums of squares (WSS, called Q). Still, the
basic idea is the same. The total Q can be partitioned into its component parts – the Q explained by the
subgroups and the Q within subgroups (and thus unexplained, or residual).
Total within
The fixed-effect model requires that all studies within the same subgroup share the same true effect
size. This assumption is tested by the Q-statistic, where Q and its degrees of freedom are computed
within subgroups and then summed across subgroups. Here, Q = 41.7894, df = 11, and p < 0.0001. This
tells us that effects probably do vary within subgroups, and the fixed-effect model is not valid [D].
Total between
The line labeled “Total between” is a test of the predictive model. Here, it addresses the question “Does
effect size vary by subgroup?” The Q-value of 110.4436 with df = 1 and p < 0.0001 tells us that it
probably does vary by subgroup [E].
Overall
The line labeled “Overall” reflects the total dispersion. It addresses the question “Do the effects vary
from each other if we ignore the subgroups and compute the variance of all studies about the grand
mean?” The Q-value of 152.2330 with df = 12 and p < 0.0001 tells us that they probably do vary [F].
Note that this is the same question we asked in the simple analysis (with no subgroups), and so it
follows that the statistics for this section (Overall) are identical to those we saw for the simple analysis in
Figure 7 [D].
Note that the variance components are additive. The Q-value within the Hot studies plus the Q-value
within the Cold studies yields the total Q-value within subgroups. Then, the Q-value within subgroups
plus the Q-value between-subgroups yields the total Q-value.
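We can check this additivity directly from the values reported in Figure 11:

```python
# Q values from the fixed-effect subgroups analysis (Figure 11).
q_within = 41.7894    # within subgroups (df = 11)
q_between = 110.4436  # between subgroups (df = 1)
q_total = 152.2330    # about the grand mean (df = 12)

# The partition is exact: within + between = total, and the df add up too.
assert round(q_within + q_between, 4) == round(q_total, 4)
assert 11 + 1 == 12
```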
The regression approach
We can perform the same analysis as a meta-regression.
Figure 12 shows the screen where we define the model.
• The first covariate is the intercept.
• The second covariate is the variable called Climate: Hot. This covariate will address the question of whether or not the effect size varies by climate. The sub-designation (Hot) follows the convention that variables are named and coded for the presence of an attribute. Since the variable is called “Hot”, Cold studies will be coded 0 while Hot studies will be coded 1.
Figure 12 | Regression Cold vs. Hot | Setup
Results are displayed in Figure 13.
Figure 13 | Regression Cold vs. Hot | Fixed-effect
Analysis of variance
In Figure 13 the Analysis of variance [I] shows how the total WSS [L] is partitioned into its component
parts – the WSS explained by the Model (here, subgroups) [J] and the Residual WSS (here, within
subgroups) [K].
Model
The line labeled “Model” [J] asks if the predictive model (climate) explains any of the variance in effect
size. Put another way, it asks if the dispersion of effects about the regression line is smaller when the
regression line is based on climate rather than based solely on the grand mean. The analysis shows that
Q = 110.4436 with df = 1 and p < 0.0001, so we conclude that the predictive model probably explains (at
least) some of the variance in effect size.
Residual
The line labeled “Residual” [K] asks if the data are consistent with the model’s assumption of a common
effect size for all studies with the same climate. The Q value is 41.7894 with df = 11 and p < 0.0001. We
conclude that the data are not consistent with the assumptions of the fixed-effect model.
Total
The line labeled “Total” [L] asks if the between-study variance for the full set of studies (with no
subgroups) is zero. This analysis is the same whether or not there are subgroups and so, as in the prior
analysis, Q = 152.2330, df = 12, p < 0.0001.
Note that the variance components are additive. The Q-value for the residual plus the Q-value for the
model yields the total Q-value.
Prediction equation
The prediction equation [I] is −0.9986 + 0.8870 × Climate.
Since Climate is coded 0 for Cold and 1 for Hot, the predicted value for Cold studies is

Y = −0.9986 + 0 × 0.8870 = −0.9986 ,   (1.1)

while the predicted value for Hot studies is

Y = −0.9986 + 1 × 0.8870 = −0.1116 .   (1.2)
Note that these are the same numbers we saw in the subgroups analysis.
• Figure 10 [A] and Figure 11 [A] showed the mean effect size for the Cold studies as −0.9986, which is the same number we see in (1.1).
• Figure 10 [B] and Figure 11 [B] showed the mean effect size for the Hot studies as −0.1115, which is the same number (within rounding) we see in (1.2).
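The two predicted values follow directly from the dummy coding:

```python
# Fitted prediction equation from Figure 13: Y = -0.9986 + 0.8870 * Climate,
# with Climate coded 0 for Cold studies and 1 for Hot studies.
b0, b1 = -0.9986, 0.8870

def predicted_log_risk_ratio(climate_hot):
    return b0 + b1 * climate_hot

assert round(predicted_log_risk_ratio(0), 4) == -0.9986   # Cold subgroup mean
assert round(predicted_log_risk_ratio(1), 4) == -0.1116   # Hot subgroup mean
```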
Summary
The total Q of each effect size about the grand mean can be partitioned into its component parts – the Q
due to the variation in effect size that can be explained by subgroup membership, and the part that
cannot. The traditional approach and the regression approach use somewhat different nomenclatures
but are identical mathematically and yield precisely the same answers.
• The Q-between (in the traditional model) and the Q-model (in the regression) are both 110.4436 with df = 1 and p < 0.0001. Each tells us that effect size probably differs between subgroups.
• The Q-within (in the traditional model) and the Q-residual (in the regression) are both 41.7894 with df = 11 and p < 0.0001. Each tells us that the assumptions of the fixed-effect model have been violated.
• The Q-total in each case is 152.2330 with df = 12 and p < 0.0001. Each tells us that effect sizes vary when we ignore subgroups and work with deviations of all studies from the grand mean.
• The Q-values are additive. Q-between plus Q-within equals Q-total.
Continuous covariate (Case C)
Immediately above (Case B) we divided the studies into “Hot” or “Cold” climates, which allowed us to
perform a subgroups analysis. In the original paper, the researchers worked with the absolute latitude of
each study as a continuous covariate. We turn to that analysis now.
There is no mechanism to work with a continuous covariate in the traditional framework.
The regression approach
Figure 14 shows the screen where we define the regression. We will use intercept and Latitude to
predict the effect size. The covariate will address the question of whether or not effect size is related to
latitude. Figure 15 shows the results of this analysis.
Figure 14 | Regression | Latitude | Setup
Figure 15 | Regression | Latitude | Fixed-effect
Analysis of variance
In the section labeled Analysis of variance [A] the WSS-total is partitioned into its component parts – the
WSS explained by latitude (Model) and the WSS-residual.
Model
The line labeled “Model” [B] addresses the hypothesis that the predictive model (latitude) explains any
of the variance in effect size. Put another way, it asks if the dispersion of effects about the regression
line is smaller when the regression line is based on latitude rather than based solely on the grand mean.
Since Q = 121.4999 with df = 1 and p < 0.0001, we conclude that the predictive model probably explains
(at least) some of the variance in effect size.
Residual
The line labeled “Residual” [C] addresses the hypothesis that the data are consistent with the model’s
assumption of a common effect size for all studies at the same latitude. The Q value is 30.7331 with df =
11 and p = 0.0012. We conclude that the data are not consistent with the assumptions of the fixed-effect model. Rather, the true effect size does vary from study to study, even for studies at the same latitude.
Total
The line labeled “Total” [D] addresses the hypothesis that the variance for the full set of studies (with no
predictors) is zero. This analysis is the same whether or not there are subgroups or covariates and so, as
in the prior analyses, Q = 152.2330, df = 12, p < 0.0001.
Summary
The total Q of each effect size about the grand mean can be partitioned into its component parts – the Q
due to the variation in effect size that can be explained by latitude, and the part that cannot.
• The Q-value for the model is Q = 121.4999 with df = 1 and p < 0.0001, which tells us that effect size is related to latitude.
• The Q-value for the residual is 30.7331 with df = 11 and p = 0.0012, which tells us that the assumptions of the fixed-effect model have been violated.
• The Q-value for the total is 152.2330 with df = 12 and p < 0.0001, which tells us that effect sizes vary when we ignore latitude and work with deviations of all studies from the grand mean.
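As an informal side-by-side (this ratio is not a statistic CMA reports), we can compare how much of the total Q each fixed-effect model captures:

```python
# Q values reported for the two fixed-effect models of the same total.
q_total = 152.2330
q_model_subgroups = 110.4436   # Cold vs. Hot dummy (Case B)
q_model_latitude = 121.4999    # continuous latitude (Case C)

share_subgroups = q_model_subgroups / q_total
share_latitude = q_model_latitude / q_total

# The continuous covariate captures more of the total Q than the
# dichotomized version of the same information.
assert share_latitude > share_subgroups
assert round(share_subgroups, 3) == 0.725
assert round(share_latitude, 3) == 0.798
```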
In context
We presented three cases to show the correspondence between regression and a traditional analysis. In
Case A there were no covariates; in Case B there was a categorical covariate; and in Case C there was
one continuous covariate.
For the traditional analysis the fixed-effect model requires that all studies share the same true effect
size (Case A), or that all studies within a subgroup share the same effect size (Case B). This assumption is
tested by the Q value based on deviations from the grand mean (Case A), or by the Q value based on
deviations from each study’s subgroup mean (Case B).
The regression model is a more general model and allows us to cover all cases by saying that the effect
size must be identical for all studies with the same predicted value. In Case A the predicted value is the
grand mean (as it was for the traditional analysis). In Case B the predicted value is the subgroup mean
(as it was for the traditional analysis). In Case C the predicted value is the point on the regression line
corresponding to the regression equation.
RANDOM-EFFECTS MODEL
Immediately above, we showed how to interpret the regression under the fixed-effect model. Now, we
show how to interpret the same regression under the random-effects model. Many of the statistics are
different but, more fundamentally, many of the questions addressed by the analysis are different.
Basic analysis (Case A)
Here, we present a meta-analysis of the BCG studies. This is a basic meta-analysis in the sense that our
goal is simply to estimate the mean effect size for the full set of studies. We will perform this exercise
(1) using a traditional approach and then (2) using meta-regression, to show the correspondence
between the two.
Figure 16 is a screen-shot from CMA using the traditional approach to meta-analysis. Note that the
effect size is in log units [A]. Each row shows the effect size and confidence interval for one study.
The line marked “Random” in Figure 16 [B] shows the summary (mean) effect size as −0.7141, with a 95% confidence interval of −1.0644 to −0.3638. The Z-value for a test of the null is −3.9952, and the corresponding p-value is < 0.0001. The same numbers are displayed in Figure 17 [C].
Figure 16 | Basic analysis | Log risk ratio | Random-effects
Figure 17 | Basic analysis | Log risk ratio | Random-effects
Test of effect size
Is the mean effect size zero?
The line labeled “Random” in Figure 16 [B] and Figure 17 [C] shows that the effect size is −0.7141 with a
standard error of 0.1787. The Z-value for a test of the null is −3.9952 with a corresponding p-value of
less than 0.0001. We would conclude that the mean effect size is probably not zero.
Heterogeneity
Is there any unexplained variance in the true effect sizes?
The mean effect size is −0.7141. Is it possible that all the observed variance about this mean reflects
sampling error, or is there evidence that some of this variance reflects differences in the true effect size
across studies?
This is addressed in Figure 17 by the section labeled “Heterogeneity” [D]. The Q-value is 152.2330 with df = 12 and p < 0.0001. This tells us that it’s unlikely that all of the variance is due to sampling error. We
conclude that the true effect size probably does vary from study to study.
Note that the heterogeneity section for the random-effects analysis (Figure 17) is identical to the one for
the fixed-effect analysis (Figure 7). In both cases the heterogeneity statistics are based on the weights
1/V (that is, where the only sampling error is within-studies). While the numbers are identical for the
fixed-effect and the random-effects models, the interpretation differs. Under the fixed-effect model,
the presence of heterogeneity in the true effects tells us that the statistical model does not match the
data. Under the random-effects model, by contrast, this heterogeneity is employed to estimate T2,
which is then incorporated into the weights assigned to each study.
How much variance is there?
This is addressed in the section labeled “Tau-squared” in Figure 17 [E].
The between-studies variance (T2) is estimated as 0.3088. The between-studies standard deviation T is
simply the square root of T2, or 0.5557.
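The square-root relationship can be checked directly, and the commonly used DerSimonian-Laird estimator of T2 is simple to sketch. The Q, df, and weights in the second part are hypothetical (not the BCG values); only the formula is the point.

```python
import math

# The reported between-studies standard deviation is the square root of T^2.
assert round(math.sqrt(0.3088), 4) == 0.5557

# DerSimonian-Laird sketch: T^2 = max(0, (Q - df) / C), where
# C = sum(w) - sum(w^2) / sum(w) uses the fixed-effect weights.
# Hypothetical Q, df, and weights for illustration only.
w = [20.0, 12.5, 25.0, 10.0]
q, df = 8.6019, 3
c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
tau2 = max(0.0, (q - df) / c)
assert tau2 > 0   # Q exceeds its df, so some between-study variance is estimated
```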
What proportion of the observed variance is true variance?
Some of the observed variance is due to real differences in effect size, while some reflects sampling
error. The I2 statistic [D] reflects the proportion of variance that is due to real differences (and thus
potentially explainable by covariates). In this case I2 is 92.1173%, which means that almost all of the
observed variance reflects real differences in study effects.
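I2 follows directly from the heterogeneity statistics already reported, with df = k − 1 = 12 for the 13 BCG studies:

```python
# I^2 = 100 * (Q - df) / Q: the share of observed variation in excess of
# what sampling error alone (Q close to df) would produce.
q, df = 152.2330, 12
i2 = 100 * (q - df) / q
assert round(i2, 4) == 92.1173   # matches the value on the screen
```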
The regression approach
We can perform the same analysis using meta-regression. Figure 18 shows the screen in CMA where we
define the model [F]. We’ve used no covariates except for the intercept.
Figure 18 | Regression | Intercept | Setup
Figure 19 shows the results of this analysis.
Figure 19 | Regression | Intercept | Main results | Random-effects
Test of effect size
Is the mean effect size zero?
Since there are no covariates, the predicted effect is simply the intercept, and so the question “Is the
mean effect size zero?” is addressed by a test of the intercept.
In Figure 19 the regression equation [G] gives the predicted effect size for all studies as Y=−0.7141 with a
standard error of 0.1787 and confidence interval of −1.0644 to −0.3638. The Z-value for a test of the
null is −3.9952 with a corresponding p-value of < 0.0001. We would conclude that the mean effect size is
probably not zero. Note that these numbers match the numbers in Figure 16[B] and Figure 17 [C].
Test of the model
The line labeled Model [H] addresses the hypothesis that the covariates explain any of the variance in
effect size. Since there are no covariates in this model, this section is not relevant. The Q-value is
shown as 0.0, the df as 0, and the p-value as 1.0.
Goodness of fit
Is there any unexplained variance in the true effect sizes?
The predicted effect size for each study is simply the intercept, −0.7141. Is it possible that all the
observed variance about the mean reflects sampling error, or is there evidence that some of this
variance reflects differences in the true effect size across studies? This line [I] is called “Goodness of fit”
since the presence of true variance means (by definition) that some variance remains unexplained. That
is, the prediction model does not “fit” (fully explain) the variance in effect sizes.
The Q-value is 152.2330, with df = 12 and p < 0.0001. We conclude that the true effect size probably does vary from study to study. These statistics are identical to those in Figure 17 [D] for the traditional analysis.
How much variance is there?
The between-studies variance (T2) is estimated as 0.3088. The between-studies standard deviation (T) is
then the square root of T2, or 0.5557. These values correspond to the values in Figure 17 [E].
What proportion of the observed variance is true variance?
Some of the observed variance is due to real differences in effect size, while some reflects sampling
error. The I2 value [I] reflects the proportion of variance that is due to real differences (and thus
potentially explainable by covariates). In this case I2 is 92.1173%, which means that almost all of the
observed variance reflects real differences in study effects. This corresponds to the value in Figure 17
[D].
Graphic
In Figure 20 we’ve plotted all 13 studies, and we’ve also plotted the regression line [L]. Since there are
no covariates in the predictive model, the regression line is horizontal. That is, the predicted value for
every one of the studies is the intercept (which is also the mean) of −0.7141. The Q-statistic was
computed by working with the deviation of every study from this predicted value. Note that this graphic
applies to both the traditional analysis (where the predicted value is the mean) and the regression
(where the predicted value is the intercept) since these values are identical.
Figure 20 | Dispersion of effects about grand mean
In addition to plotting the mean treatment effect [L] we can also plot the distribution of (true) treatment effects. Given a mean of −0.7141 and a standard deviation (T) of 0.5557, we would expect some 95% of all true effects to fall in the approximate range of −1.8033 to 0.3751 (that is, the mean +/− 1.96 T). This range is represented by the normal curve superimposed on the plot, which is centered at −0.7141 and extends from −1.8033 [N] to 0.3751 [M]. The figure is drawn to scale, and we can see that the curve captures almost all of the effect sizes.
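The endpoints of this range can be checked directly:

```python
# 95% range of true effects: mean +/- 1.96 * T.
mean, t = -0.7141, 0.5557
lo, hi = mean - 1.96 * t, mean + 1.96 * t

assert round(lo, 4) == -1.8033
assert round(hi, 4) == 0.3751
```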
Note
The curve is intended to capture most of the dispersion in true effects, not in observed effects. As it
happens, in this example the observed effects and the true effects are very similar (I2 is 92%, which
means that almost all of the observed dispersion is real) and for that reason, most of the effects fall
within the curve. However, this will not always be the case. For example, suppose that I2 had been 25%.
In this case the true dispersion would have been less, and the curve would have been smaller. While the
curve would still capture most of the dispersion in true effects, many of the observed effects would fall
outside the curve.
Comparison of Model 1 with the null model
Returning to Figure 19, the goal of section [J] is to estimate R2, the proportion of between-study
variance (T2) explained by the model. For this purpose we need two estimates of T2 – with predictors [I]
and without predictors [J]. We will then use these two numbers to compute R2 [K].
Total between-study variance
The line labeled [J] displays the unexplained between-study variance (T2) when there are no predictors
in the model, which is 0.3088. This is the same value displayed in the traditional analysis (Figure 17 [E]).
Line [J] gives the original T2 while line [I] gives the residual T2 after we’ve entered all the covariates. In this example, since there are no covariates in the model, the two values are identical. When there are covariates in the model, the estimate of T2 on line [I] may be less than the estimate on line [J], and the difference would be used to compute R2, the proportion of the original variance explained by the covariates.
Proportion of variance explained
This section of the screen [K] is used to report what proportion of the total variance can be explained by
the predictive model. In this example there are no covariates. Therefore, the estimate of T2 on line [J]
is the same as the estimate on line [I] and R2 is shown as 0.00.
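The R2 computation can be sketched from the two T2 estimates; the 0.10 value below is hypothetical, included only to show a nonzero case.

```python
def r_squared(tau2_total, tau2_residual):
    """Proportion of between-study variance explained by the covariates."""
    return 1 - tau2_residual / tau2_total

# With no covariates the two estimates coincide, so R^2 = 0 (this example).
assert r_squared(0.3088, 0.3088) == 0.0

# Hypothetical: if covariates reduced T^2 from 0.3088 to 0.10,
# about 68% of the true variance would be explained.
assert round(r_squared(0.3088, 0.10), 2) == 0.68
```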
Summary
Our goal here was to show the correspondence between a traditional analysis and a regression for a
simple analysis.
• In a meta-analysis with no covariates we want to estimate (and test) the effect size. This question is addressed by the mean effect (−0.7141) in the traditional analysis, and by the intercept (−0.7141) in the regression. In both cases the standard error is 0.1787 and the p-value is < 0.0001, which tells us that the true mean effect size is probably not zero.
• We also want to know if there is evidence that the true effect size varies across studies. This question is addressed by the Q-test for heterogeneity in the traditional analysis, and by the Q-test for the residual in the regression. In both cases the Q-value is 152.2330 with df = 12 and p < 0.0001, which tells us that the true effect sizes probably do vary.
• Finally, we want to estimate the variance in true effect sizes. This estimate, called T2, is 0.3088 in both cases. This variation is incorporated into the model, and affects the weights assigned to each study.
Subgroups analysis (Case B)
Immediately above (Case A) we found that the impact of the vaccine varied from study to study. The
researchers hypothesized that this variation might be explained by the fact that studies were conducted
in various locations, and that the vaccine was more effective in colder climates. To test this hypothesis
we can classify each study as being either “Hot” or “Cold” and then perform an analysis (a) to estimate
the effect size in each subgroup of studies, and (b) to compare the effect size for the two subgroups.
(The researchers used latitude as a continuous covariate rather than creating two groups).
In Figure 21 the studies have been divided into subgroups.
• The six “Cold” studies are at the top, followed by their mean effect, a log risk ratio of −1.1987 [A].
• The seven “Hot” studies are at the bottom, followed by their mean effect, a log risk ratio of −0.2784 [B].
Figure 21 | Subgroups Cold vs. Hot | Random-effects
The same information is displayed in Figure 22. Here, the top section is labeled “Fixed-effect” and
reports statistics based on fixed-effect weights. The bottom one is labeled “Mixed-effects” and reports
statistics based on random-effects weights within subgroups. (The label “Mixed-effects” refers to the
fact that we use the random-effects model within subgroups, but not between subgroups.)
In Figure 22 we are working with the section labeled “Mixed effects” (which means that we’re using
random-effects weights within subgroups). The lines marked [A] and [B] correspond to lines [A] and [B]
in Figure 21, and show the mean effect size, standard error, variance, and confidence interval for the
two subgroups.
Figure 22 | Subgroups Cold vs. Hot | Random-effects
Note on computing T2
CMA offers options for computing T2 in the presence of subgroups. Select the option to compute T2 within subgroups and then pool the estimates across subgroups, as shown in Figure 23 and Figure 24. For an explanation of these options, see Appendix 4: Computing τ2 in the presence of subgroups.
To apply this option in CMA, on the analysis screen
• Select Computational options > Mixed and random effects options
• Select the option to Assume a common among-study variance across subgroups
• Select the option to Combine subgroups using a fixed-effect model
Figure 23 | Option for computing T2 in the presence of subgroups
Figure 24 | Option for computing T2 in the presence of subgroups
Is the mean effect size zero for each subgroup?
For the Cold studies this is addressed by Figure 21 [A] and Figure 22 [A]. The mean effect size is −1.1987
with a standard error of 0.1769, and confidence interval of −1.5445 to −0.8518. The test that the mean
is zero is addressed by the Z-value of −6.7740 and corresponding p-value of < 0.0001. We would
conclude that the mean effect size in the universe of Cold studies is probably not zero.
For the Hot studies this is addressed by Figure 21 [B] and Figure 22 [B]. The mean effect size is −0.2784
with a standard error of 0.1522, and confidence interval of −0.5767 to +0.0199. The test that the mean is zero is addressed by the Z-value of −1.8292 and corresponding p-value of 0.0674. This fails to meet
the traditional criterion alpha of 0.05, and so by this criterion we cannot reject the null that the effect
size is zero (that the vaccine has no impact).
Test of the model
Is effect size related to subgroup membership?
This question is addressed by the section labeled “Mixed-effects analysis”. The lines marked Cold [A]
and Hot [B] give the mean effect size for each group, based on random-effects weights. The line marked
“Total between” is a test of the difference between these two values (−1.1987 vs. −0.2784). The Q-value
for this difference is 15.5445 with 1 df, with a corresponding p-value of 0.0001 [C]. We conclude that
effect size probably does differ by subgroups.
Heterogeneity
Is there any unexplained variance in the true effect sizes?
Immediately above, we saw that we can use information about a study’s subgroup to improve our ability
to predict that study’s effect. That is, by using the subgroup mean rather than the grand mean to
predict a study’s effect size, we are able to make a more accurate prediction. But does subgroup
membership enable us to completely predict that study’s effect – do all studies within a subgroup share
a common effect size? Or, is there variance in true effects within subgroups?
To test the assumption that there is no variance in true effect sizes within subgroups we compute Q and
df within subgroups and then sum these values across subgroups as shown in Table 2. This table is taken
from the section labeled [D] in Figure 22.
Table 2

         Q         df   p-value
Cold     20.3464    5   0.0011
Hot      21.4431    6   0.0015
Total    41.7894   11   0.0000
For Q = 41.7894 with 11 df, the p-value is < 0.0001. This tells us that the true effect sizes do vary from
study to study, even within subgroups. Put another way, the model is incomplete – knowing whether a
study falls into the Cold or Hot subgroup does not allow us to completely predict its effect size.
How much variance is there?
The Q-values computed immediately above are employed to estimate the variance of true effect sizes
(T2) within subgroups. For the Cold subgroup T2 is 0.1383. For the Hot subgroup T2 is 0.0741. The
combined estimate (computed within subgroups and combined across subgroups) is not shown on this
screen, but is 0.0964 (see Appendix 4: Computing τ2 in the presence of subgroups). In each case the
standard deviation of true effect sizes (T) is the square root of the variance. The combined estimate of T
is 0.3105.
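CMA performs these computations internally, but the logic is easy to verify. The following Python sketch shows the method-of-moments (DerSimonian-Laird) estimate of T2 computed from Q within a single subgroup; the study-level values here are hypothetical, not the actual BCG data.

```python
import numpy as np

# Hypothetical effect sizes and within-study variances for one subgroup
# (illustrative values only -- not the actual BCG data).
y = np.array([-1.4, -0.9, -1.3, -1.0, -1.5, -1.1])  # observed effects
v = np.array([0.05, 0.08, 0.04, 0.10, 0.06, 0.07])  # within-study variances

w = 1.0 / v                          # fixed-effect weights
mean = np.sum(w * y) / np.sum(w)     # weighted subgroup mean
Q = np.sum(w * (y - mean) ** 2)      # heterogeneity statistic for the subgroup
df = len(y) - 1

# Method-of-moments (DerSimonian-Laird) estimate of the true variance
C = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - df) / df if False else (Q - df) / C)  # (Q - df) / C
tau = tau2 ** 0.5                    # standard deviation of true effects
```

To pool across subgroups, the same quantities (Q, df, and C) are summed over subgroups before forming the ratio, as described in Appendix 4.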
What proportion of the observed variance is true variance?
Some of the observed variance within subgroups is due to real differences in effect size, while some
reflects sampling error. The I2 value reflects the proportion of this variance that is due to real
differences (and thus potentially explainable by covariates). For Cold this value is 75.4256%, and for Hot
this value is 72.0189%. The combined estimate is not shown on this screen, but is 73.6775%. This means
that most of the within-subgroup variance reflects real differences in study effects (see Appendix 4:
Computing τ2 in the presence of subgroups).
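The I2 values above can be reproduced directly from the Q statistics and degrees of freedom in Table 2, using the relation I2 = (Q − df)/Q. A short Python sketch:

```python
def i_squared(q, df):
    """I2: proportion of observed variation reflecting true heterogeneity (%)."""
    return max(0.0, (q - df) / q) * 100

# Q and df as reported in Table 2
cold = i_squared(20.3464, 5)        # ~75.43%
hot = i_squared(21.4431, 6)         # ~72.02%
combined = i_squared(41.7894, 11)   # ~73.68%
```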
The regression approach
We can perform the same analysis as a meta-regression. Figure 25 shows the screen where we define
the model.
• The first covariate is the intercept.
• The second covariate is the variable called Climate: Hot. This covariate will address the question of whether or not the effect size varies by climate. The sub-designation (Hot) follows the convention that variables are named and coded for the presence of an attribute. Since the variable is called “Hot”, Cold studies will be coded 0 while Hot studies will be coded 1.
Figure 25 | Regression | Climate | Setup
Figure 26 | Regression | Climate | Main results | Random-effects
Figure 26 shows the results of the analysis.
It is important to note that the results shown on this page are collated from two separate analyses.
• We run one regression with the intercept and climate as predictors. This is the basis for the sections which report on the impact of each covariate [I], the test of the model [J], and the goodness of fit [K].
• We run a second regression with only the intercept. This is the basis for the section that reports the value of T2 with no covariates (that is, the true variance about the grand mean) [L].
• Then we use the estimate of T2 with covariates [K] and without covariates [L] to compute the proportion of variance explained, or R2, in section [M].
Test of the model
Is effect size related to subgroup membership?
The question of whether or not effect size is related to subgroup membership is addressed by the
section labeled “Test of the model” [J]. The Q-value is 15.5445, and with df = 1 the p-value is 0.0001.
These are the same values that we saw in Figure 22 [E]. We conclude that effect size does differ by
subgroup.
Equivalently (since there is only one covariate in the model), the Z-value for climate is 3.9426 with a p-value of 0.0001. (The test of climate is based on Z, which is a standardized difference. The test of the model is based on Q, which is a squared index. When there is only one covariate, Z2 is equal to Q. Here, 3.9426 squared equals 15.5445.)
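This relationship is easy to confirm with the reported values, both for the climate model here and for the latitude model discussed later (the tiny discrepancies are rounding):

```python
# Reported statistics: Z for the covariate and Q for the test of the model
z_climate, q_climate = 3.9426, 15.5445     # climate as covariate
z_latitude, q_latitude = -4.3411, 18.8452  # latitude as covariate

# With a single covariate, Z squared equals the model Q (up to rounding)
check_climate = z_climate ** 2             # ~15.5441
check_latitude = z_latitude ** 2           # ~18.8451
```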
Goodness of Fit
Is there any unexplained variance in the true effect sizes?
Immediately above, we saw that we can use information about a study’s subgroup to improve our ability
to predict that study’s effect. But does this information enable us to completely predict that study’s
effect – do all studies within a subgroup share a common effect size? Or is there variance in true effects
within subgroups? This is called a Goodness of fit test, since we can say that the model provides a good fit for the effects if there is no evidence of unexplained heterogeneity.
To address this question we compute Q working with the deviation of each study from its predicted
effect, which is −1.1987 for the Cold studies and −0.2784 for the Hot studies. Computed in this way, Q is
41.7894 with 11 df and the corresponding p-value is less than 0.0001 [K]. This tells us that the true
effect size varies from study to study, even within subgroups. Put another way, the model is incomplete
– knowing whether a study falls into the Cold or Hot subgroup does not allow us to completely predict
its effect size. This is the same value we saw in the traditional analysis Figure 22 [G].
How much variance is there?
In this same section [K] the program shows that T2, the variance of true effect sizes about the subgroup
mean, is 0.0964. It follows that T, the standard deviation of true effect sizes about the subgroup mean is
0.3105. These values refer to the dispersion of true effects within each of the subgroups, and are
assumed to be the same for all subgroups.
What proportion of the observed variance is true variance?
The I2 statistic [K] is 73.6775%, which means that nearly three-fourths of the observed variance that
remains (that is, within subgroups) reflects real differences in study effects.
Note that the program reports I2 for two separate analyses. The one on line [L] is the proportion of the total variance that represents between-study (true) variance, and can potentially be explained by study-level covariates. The one on line [K] is the proportion of the within-subgroups variance that represents between-study (true) variance, and can potentially be explained by study-level covariates.
Graphic
In Figure 27 we’ve plotted all 13 studies, and the regression line. (While the regression line is actually a
line that intersects the Cold and Hot columns at specific points, we’ve taken the liberty of drawing a
horizontal line at the points of intersection.)
The predicted value for the Hot studies is −0.2784 with a standard deviation of 0.3105. If we assume
that the true effects are normally distributed about each predicted value we would expect the true
effects for Hot studies to fall in the range of −0.2784 plus/minus 1.96 times 0.3105, or −0.8870 to
0.3302. In Figure 27 we have superimposed a normal curve on the study points to reflect this span of
true effects [N].
The predicted value for the Cold studies is −1.1199 with a standard deviation of 0.3105. If we assume
that the true effects are normally distributed about each predicted value we would expect the true
effects for Cold studies to fall in the range of −1.1199 plus/minus 1.96 times 0.3105, or −1.7285 to
−0.5113. In Figure 27 we have superimposed a normal curve on the study points to reflect this span of
true effects [O].
Note that the plot shows the observed effects. By contrast, the curves are intended to capture some
95% of the true effects, which are assumed to fall closer to the predicted values.
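The interval arithmetic behind these curves is simply the predicted value plus or minus 1.96 T. A short Python sketch using the values reported above:

```python
def true_effect_range(pred, tau, z=1.96):
    """Range expected to cover ~95% of true effects, assuming normality."""
    return (pred - z * tau, pred + z * tau)

# Predicted values and T as reported in the text
hot_lo, hot_hi = true_effect_range(-0.2784, 0.3105)    # ~(-0.8870, 0.3302)
cold_lo, cold_hi = true_effect_range(-1.1199, 0.3105)  # ~(-1.7285, -0.5113)
```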
Figure 27 | Dispersion of effects about the subgroup means
Comparison of Model 1 with the null model
Returning to Figure 26, the intent of the section labeled “Comparison of Model 1 with the null model” is
to report the proportion of variance explained by the model, an index analogous to R2 in primary
regression.
The index is a ratio of the explained variance to the total variance, and to get both numbers we need to
run two separate regressions.
To get the initial amount of variance we run a regression with no covariates and compute T2. Here, T2 is 0.3088, which is the variance of all studies about the grand mean [L].
To get the variance that remains with the covariates, we run a regression with the covariates and compute T2. Here, T2 is 0.0964, which is the variance of all studies about the regression line [K].
Proportion of variance explained [M]
If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.0964, then the ratio
T2Residual / T2Total = 0.0964 / 0.3088 = 0.3122
(1.3)
gives us the proportion of variance that is not explained by the covariates. R2, the proportion of
variance that is explained by the covariates is then
R2 = 1 − (T2Residual / T2Total) = 1 − (0.0964 / 0.3088) = 0.6878
(1.4)
We show this graphically in Figure 28, which juxtaposes Figure 20 with Figure 27. At left, the normal
curve [P] reflects the unexplained variance in effects when the predicted value for each study is the
grand mean. At right, the normal curves [Q,R] represent the variance in effects when the predicted
value for each study is the corresponding subgroup mean. This is the variance not explained by
subgroup membership. The variance at the right (0.0964) is less than the variance at the left (0.3088),
which tells us that by using climate as a covariate we can reduce the unexplained variance – or
(equivalently) explain some of the variance.
Figure 28 | Dispersion about grand mean vs. dispersion about subgroup means
Equation (1.3) gives the ratio of the variance at right to the variance at left (the ratio of not explained to
total). Then in equation (1.4) we subtract this value from 1.0 to get the value of R2.
An equivalent approach to computing R2 is as follows. If the initial (total) T2 is 0.3088 and the
unexplained (residual) T2 is 0.0964, it follows that the difference (0.2124) is the T2 explained by the
model. Then we can compute R2, the proportion explained by the model, as
R2 = T2Explained / T2Total = 0.2124 / 0.3088 = 0.6878
(1.5)
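Both formulations can be checked in a few lines of Python using the two T2 estimates reported above:

```python
tau2_total = 0.3088   # T2 with no covariates (variance about the grand mean)
tau2_resid = 0.0964   # T2 with climate in the model (variance within subgroups)

unexplained = tau2_resid / tau2_total             # ~0.3122, as in (1.3)
r2 = 1 - unexplained                              # ~0.6878, as in (1.4)
r2_alt = (tau2_total - tau2_resid) / tau2_total   # same value, as in (1.5)
```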
Prediction equation
The prediction equation [I] is −1.1987 + 0.9203 × Climate.
Since Climate is coded 0 for Cold and 1 for Hot, the predicted value for Cold studies is

Y = −1.1987 + 0 × 0.9203 = −1.1987,
(1.6)

while the predicted value for Hot studies is

Y = −1.1987 + 1 × 0.9203 = −0.2784.
(1.7)
These are the same values that we saw for the subgroup means in Figure 22 [C] and [D].
Summary
The Q statistic
• The Q-Between (in the traditional model) and the Q-Model (in the regression) are both 15.5445, with df = 1 and p = 0.0001. Each tells us that effect size differs between subgroups.
• The Q-within (in the traditional model) and the Q for goodness of fit (in the regression) are both 41.7894, with df = 11 and p < 0.0001. Each tells us that the true effect size varies, even within subgroups.
• The Q-total in each case is 152.2330, with df = 12 and p < 0.0001. Each tells us that effect sizes vary when we ignore subgroups and work with deviations of all studies from the grand mean.
The I2 statistic
• The I2 statistic tells us what proportion of the variation in observed effects reflects variance in true effects rather than sampling error.
• When there are no covariates [L] the I2 value is 92.1173%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates.
• When we use climate as a covariate [K] the I2 value is 73.68%, which tells us that some 74% of the observed variance about the subgroup means is real, and may potentially be explained by additional study-level covariates.
The R2 Index
• The between-study variance is estimated at 0.0964 within subgroups, as compared to 0.3088 for the population as a whole. It follows that the variance (in log units) explained by subgroups is 0.2124. The ratio of explained to total corresponds to an R2 of 0.6878, meaning that 68.7824% of the variance in true effects can be explained by climate. This is reflected in Figure 28, where the range of true effects about subgroup means is smaller than the range of true effects about the grand mean.
Continuous covariate (Case C)
In the prior analysis we classified each study as either “Cold” or “Hot” based on its absolute distance
from the equator. In the original paper, the researchers did not classify the studies as Cold or Hot.
Rather, they worked with the absolute latitude of each study as a continuous covariate. We turn to that
analysis now.
There is no mechanism to work with a continuous covariate in the traditional framework.
The regression approach
Figure 29 shows the screen where we define the regression using the intercept and latitude to predict
the effect size. Figure 30 shows the results of this analysis.
Figure 29 | Regression | Latitude | Setup
Figure 30 | Regression | Latitude | Main results | Random-effects
As before, it is important to note that the results shown on this page are collated from two separate
analyses.
• We run one regression with the intercept and latitude as predictors. This is the basis for the sections which report on the impact of each covariate [A], the test of the model [B], and the goodness of fit [C].
• We run a second regression with only the intercept. This is the basis for the section that reports the value of T2 with no covariates (that is, the true variance about the grand mean) [D].
• Then we use the estimate of T2 with covariates from [C] and without covariates [D] to compute the proportion of variance explained, or R2, in section [E].
Prediction equation
At the top [A] the program shows the coefficient for the intercept and for each covariate, along with the
standard error, confidence interval, and significance test.
Test of the model
Is effect size related to latitude?
In Figure 30 the prediction equation [A] is 0.2595 − 0.0292 × Latitude.
The question of whether or not effect size is related to latitude is addressed by the section labeled “Test of the model” [B]. The Q-value is 18.8452 with 1 df and a p-value of < 0.0001. We conclude that effect size probably is related to latitude.
Equivalently (since there is only one covariate in the model), the Z-value for latitude is −4.3411 with a p-value of < 0.0001. (The test of latitude is based on Z, which is a standardized difference. The test of the model is based on Q, which is a squared index. When there is only one covariate, Z2 is equal to Q. Here, −4.3411 squared equals 18.8452.)
Goodness of fit
Is there any unexplained variance in the true effect sizes?
Immediately above, we saw that we can use information about a study’s latitude to improve our ability
to predict that study’s effect. But does this information enable us to completely predict that study’s
effect? That is, do all studies at the same latitude share a common effect size? Or is there variance in
true effects among studies at the same latitude?
This is addressed by the Goodness of Fit [C]. To compute a measure of dispersion we work with the
deviation of each study from that study’s predicted effect size, where the predicted effect size is a
function of each study’s latitude. Computed as a deviation from this predicted value, Q is 30.7331 with
11 df, and the corresponding p-value is 0.0012. This tells us that the true effect size varies from study to
study, even within latitudes. Put another way, the model is incomplete – knowing a study’s latitude
does not allow us to completely predict its effect size. (Unlike subgroups, we may not have multiple
studies at the same latitude, but the idea is the same – for each study we compute the deviation from
the prediction line to the observed effect size).
How much variance is there?
The program [C] shows that T2, the variance of true effect sizes at any point on the regression line, is 0.0633. It follows that T, the standard deviation of true effect sizes at any point on the regression line, is 0.2516.
What proportion of the observed variance is true variance?
• The I2 statistic tells us what proportion of the variation in observed effects reflects variance in true effects rather than sampling error.
• When there are no covariates [D] the I2 value is 92.1173%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates.
• When we use latitude as a covariate [C] the I2 value is 64.21%, which tells us that some 64% of the observed variance about the regression line is real, and may potentially be explained by additional study-level covariates.
Graphic
In Figure 31 we’ve plotted all 13 studies and the regression line. The Q-statistic was computed by
working with the deviation of every study from the regression line.
Figure 31 | Dispersion of effects about regression line for latitude
The estimate of the variance (T2) is 0.0633 and of the standard deviation (T) is 0.2516. If we assume that
these effects are normally distributed about each predicted value we would expect the true effects for
all studies to fall at the predicted value plus/minus 1.96 T, or within 0.4931 on either side of the
predicted value. This holds true for any point on the regression curve, but for illustrative purposes we
have superimposed a normal curve at a few arbitrary points on the regression line [F,G,H] to reflect this
range.
Note that the plot shows the observed effects. By contrast, the curves are intended to capture some
95% of the true effects, which are assumed to fall closer to the regression line.
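For any latitude, the predicted effect and the span expected to contain some 95% of true effects can be computed directly from the reported coefficients. A short Python sketch (the latitude of 33 is an arbitrary illustrative value, not one taken from the dataset):

```python
def predict(latitude):
    """Predicted log risk ratio from the fitted equation [A]."""
    return 0.2595 - 0.0292 * latitude

tau = 0.2516                      # T, as reported in the text
half_width = 1.96 * tau           # ~0.4931 on either side of the line

pred_33 = predict(33)             # prediction at 33 degrees absolute latitude
band_33 = (pred_33 - half_width, pred_33 + half_width)
```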
Comparison of Model 1 with the null model
Returning to Figure 30, the intent of the section labeled “Comparison of Model 1 with the null model” is
to report the proportion of variance explained by the model, an index analogous to R2 in primary
regression.
The index is a ratio of the variance explained to the total variance, and to get both numbers we need to
run two separate regressions.
To get the initial amount of variance we run a regression with no covariates and compute T2. Here, T2 is 0.3088, which is the variance of all studies about the grand mean [D].
To get the variance that remains with the covariates, we run a regression with the covariates and compute T2. Here, T2 is 0.0633, which is the variance of all studies about the regression line [C].
Proportion of variance explained
To get the final amount of variance we run a regression with the covariates and compute T2. This value,
reported above [C] as 0.0633, is the variance of studies about their predicted value.
If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.0633, then the ratio
T2Residual / T2Total = 0.0633 / 0.3088 = 0.2050
(1.8)
gives us the proportion of variance that is not explained by the covariates. R2, the proportion of
variance that is explained by the covariates is then
R2 = 1 − (T2Residual / T2Total) = 1 − (0.0633 / 0.3088) = 0.7950
(1.9)
We show this graphically in Figure 32, which juxtaposes Figure 20 with Figure 31. At left, the normal
curve [I] reflects the unexplained variance in true effects when the predicted value for each study is the
grand mean. At right, the normal curves [J, K, L] represent the variance in true effects when the
predicted value for each study is the corresponding point on the regression line. This is the variance not
explained by latitude. The variance at the right (0.0633) is less than the variance at the left (0.3088),
which tells us that by using latitude as a covariate we can reduce the unexplained variance – or
(equivalently) explain some of the variance.
Figure 32 | Dispersion about grand mean vs. dispersion about regression line
An equivalent approach to computing R2 is as follows. If the initial (total) T2 is 0.3088 and the
unexplained (residual) T2 is 0.0633, it follows that the difference (0.2455) is the T2 explained by the
model. Then we can compute R2, the proportion explained by the model, as
R2 = T2Explained / T2Total = 0.2455 / 0.3088 = 0.7950
(1.10)
Summary
The Q statistic
• The Q-Model is 18.8452 with df = 1 and p < 0.0001. This tells us that effect size is related to latitude.
• The Q-value for goodness of fit is 30.7331 with df = 11 and p = 0.0012. This tells us that the true effect size varies, even among studies at the same latitude.
• The Q-total is 152.2330 with df = 12 and p < 0.0001. This tells us that effect sizes vary when we ignore latitude and work with deviations of all studies from the grand mean.
The I2 statistic
• The variance in effect sizes that is observed within a given latitude is partly due to real differences (which can potentially be explained by additional study-level covariates) and partly due to within-study sampling error. When there are no covariates [D] the I2 value is 92.1173%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates. When we use latitude as a covariate the I2 value [C] is 64.2080%, which tells us that some 64% of the remaining variance [F, G, H in Figure 31] is real, and may potentially be explained by additional study-level covariates.
The R2 statistic
• The between-study variance is estimated at 0.0633 at any given point on the regression line based on latitude, as compared to 0.3088 about the grand mean. This corresponds to an R2 of 0.7950, meaning that 79.50% of the true variance in effects can be explained by latitude.
In context
We presented three cases to show the correspondence between regression and a traditional analysis. In
Case A there were no covariates; in Case B there was a categorical covariate; and in Case C there was
one continuous covariate.
In Case A we tested the effect size by looking at the mean (for the traditional analysis) or the intercept
(for the regression). In Case B we looked at the relationship between effect size and subgroup (for the
traditional analysis), or between the effect size and the covariate (for regression). In Case C we looked
at the relationship between effect size and the covariate.
For the traditional analysis, to estimate the variance in true effect sizes we computed the Q-value based on deviations from the grand mean (Case A), or based on deviations from each study’s subgroup mean (Case B).
The regression model is a more general model and allows us to cover all cases by saying that we
computed the Q-value based on deviations from each study’s predicted value. In Case A the predicted
value is the grand mean (as it was for the traditional analysis). In Case B the predicted value is the
subgroup mean (as it was for the traditional analysis). In Case C the predicted value is the point on the
regression line corresponding to the regression equation.
In any case, once we had Q we used it to estimate the true variance for the relevant population (all
studies, studies within subgroups, or studies at the same latitude). In addition to estimating the
variance of effects (T2) and the standard deviation of effects (T), we were able to report what proportion
of the observed variance was real (I2) and what proportion of the original variance was explained by the
predictive model (R2).
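CMA performs all of these computations for us. For readers who want to see the mechanics, the following Python sketch (our own illustration, not CMA's internal code) computes the coefficients, the Q for the model, and the Q for goodness of fit by weighted least squares. With tau2 = 0 the weights are fixed-effect weights; supplying an estimate of tau2 gives random-effects weights. The decomposition Q-total = Q-model + Q-residual holds when the design matrix includes an intercept.

```python
import numpy as np

def meta_regression_q(y, v, X, tau2=0.0):
    """Weighted least squares sketch of meta-regression.

    y: observed effect sizes; v: within-study variances;
    X: design matrix including an intercept column;
    tau2: between-study variance (0 gives fixed-effect weights).
    Returns (coefficients, Q-model, Q-residual)."""
    w = 1.0 / (v + tau2)
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

    resid = y - X @ beta
    q_resid = float(np.sum(w * resid ** 2))          # goodness-of-fit Q
    grand = np.sum(w * y) / np.sum(w)
    q_total = float(np.sum(w * (y - grand) ** 2))    # Q about the grand mean
    return beta, q_total - q_resid, q_resid
```

For example, with a covariate that predicts the effects perfectly, Q-residual is zero and Q-model equals Q-total.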
PART 6: META-REGRESSION IN CMA
WHAT’S NEW IN THIS VERSION OF META-REGRESSION?
• The prior version of CMA (Version 2) included a module to perform simple regression (one covariate).
• The current version (Version 3) incorporates a full-fledged regression module, which allows for any number of covariates.
Additionally, the new module includes an array of sophisticated options, including the following.
• Define “Sets” of variables, such as a set of covariates that together capture the impact of a categorical variable, or a set of covariates that represent the linear, curvilinear, and cubic relationship of dose with effect size.
• Automatically create and code dummy variables for categorical covariates.
• Select from an array of computational options, including the choice to use the Z distribution or the Knapp-Hartung adjustment, and to use the method of moments (MM), full maximum likelihood (ML), or restricted maximum likelihood (REML) for estimating τ2.
• The regression plot incorporates many options, including the ability to display both confidence intervals and prediction intervals for the regression line.
• Export data and residuals to Excel™ for further processing.
• The program allows you to define two or more prediction models. For example, define one model that includes a series of nuisance variables and another that includes these variables plus variables that represent the treatment. The program displays the proportion of variance explained by each model and also a test that compares the models.
THE COVARIATES AND THE PREDICTIVE MODEL
In the prior chapter our goal was to show the correspondence between meta-regression and a
traditional meta-analysis. For that reason we included no more than one predictor in each analysis.
Of course, a key strength of meta-regression is that it allows us to include two or more predictors in an analysis, and we will be using multiple predictors in most of the examples that follow.
The interpretation of a meta-regression is basically the same as that of regression in a primary study.
The analysis will yield a set of statistics for each covariate, as well as a set of statistics for the model. The
statistics for each covariate reflect the impact of that covariate, with all other covariates held constant.
The statistics for the full model reflect the combined impact of all covariates.
• If we have covariates A and B and want to know the impact of each covariate ignoring the other (that is, ignoring the potential confound) we would run two analyses. The first analysis would include only A. The second would include only B.
• If we have covariates A and B, and we want to know the impact of each with the other held constant, we would run one analysis that includes both A and B. The statistics for each covariate reflect the impact of that covariate with all other covariates partialled, or held constant. The statistics for the model reflect the contribution of A and B as a set.
• If we have covariates A and B, and we want to know the impact of each with the other held constant, and also assess the interaction A x B, we would run one analysis that includes A, B, and AB. The statistics for AB give the impact of the interaction over and above the main effects. The statistics for the model reflect the impact of the two main effects plus the interaction.
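The three analyses differ only in the design matrix. A minimal Python sketch (with hypothetical codings, not the BCG data) of how the covariates and the A x B interaction term would be laid out:

```python
import numpy as np

# Hypothetical 0/1 codings for six studies (illustrative only)
A = np.array([0, 0, 1, 1, 0, 1])   # e.g., allocation: random = 1
B = np.array([1, 0, 1, 0, 1, 0])   # e.g., climate: hot = 1

intercept = np.ones_like(A)

# Model with A and B: each coefficient reflects one covariate
# with the other held constant (partialled).
X_main = np.column_stack([intercept, A, B])

# Model adding the A x B interaction: its coefficient reflects the
# interaction over and above the two main effects.
X_inter = np.column_stack([intercept, A, B, A * B])
```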
QUICK START
1) On the data-entry screen
a) Create a column for study name
b) Create a set of columns for the effect size
c) Identify one or more columns as “Moderators” and set the subtype to either “Integer”, “Decimal”, or “Categorical”
d) Enter the data
Or, simply start CMA and then open the BCG file. Be sure to use the file BCGP if your computer
uses a period to indicate decimal places, or BCGC if it uses a comma for this purpose.
2) On the main analysis screen
a) Optionally, select the effect size index
b) Optionally, select the studies to be included in the regression
c) Optionally, specify how to work with studies that included multiple subgroups, outcomes, time-points, or comparisons.
d) Click Analyses > Meta-regression 2
3) On the regression screen – define the regression
a) Select the covariates to be included in the model
b) Optionally, define “Sets” of covariates
c) Optionally, define multiple models
d) Optionally, select [Computational options]
e) Run the analysis
4) On the regression screen – navigate the results
a) Click the [Fixed] or [Random] tab at the bottom of the screen to select the model
b) Click the model name (when several models have been created)
c) Use the toolbar to move between the main analysis screen, the scatterplot, diagnostics,
increments, model comparisons, and other screens
5) On the regression screen – save the analysis (Optionally)
6) On the regression screen – export the results (Optionally)
In this manual we use the BCG data as the motivating example.
STEP 1: ENTER THE DATA
Insert column for study names
In Figure 33 [A], click Insert > Column for > Study names.
Figure 33 | Data-entry | Step 01
In Figure 34 [B], the program creates a column labeled “Study name”.
Figure 34 | Data-entry | Step 02
Insert columns for effect size data
In Figure 35, click Insert > Column for > Effect size data.
Figure 35 | Data-entry | Step 03
The program opens a wizard (Figure 36) that allows you to specify the kind of summary data you will enter.
• Select <Show all 100 formats> [D]
• Click [Next] [E]
Figure 36 | Data-entry | Step 04
In the wizard (Figure 37)
• Select the top option button [F]
• On this screen, click [Next] [G]
Figure 37 | Data-entry | Step 05
In Figure 38, drill down to
• Dichotomous (number of events)
• Unmatched groups, prospective (e.g., controlled trials, cohort studies)
• Events and sample size in each group [H]
Then, click <Finish>
Note that we will be entering events and sample size (N) for each group. Some of the texts that use the
BCG example report events and non-events rather than events and N.
Figure 38 | Data-entry | Step 06
The program creates columns as shown in Figure 39. It also opens a wizard that allows you to label the
columns.
• Enter Vaccine/Control as names for the two groups [I]
• Enter TB/Ok as names for the two outcomes [J]
Then, click [Ok]
Figure 39 | Data-entry | Step 07
The program applies the labels as shown in Figure 40 [K].
Figure 40 | Data-entry | Step 08
Insert columns for moderators (covariates)
Next, we need to create columns for the moderator variables. As shown in Figure 41,
• Click Insert > Column for > Moderator variable [L]
Figure 41 | Data-entry | Step 09
The program opens a wizard (Figure 42)
• Set the variable name to “Latitude” [M]
• Set the column function to Moderator [N]
• Set the data type to Integer [O]
Then, click [Ok]
Figure 42 | Data-entry | Step 10
As shown in Figure 43, click Insert > Column for > Moderator variable
• Set the variable name to “Year”
• Set the column function to Moderator
• Set the data type to Integer
(This is the year the study was performed, not the year of publication)
Then, click [Ok]
Figure 43 | Data-entry | Step 11
As shown in Figure 44, click Insert > Column for > Moderator variable
• Set the variable name to “Allocation”
• Set the column function to [Moderator]
• Set the data type to [Categorical]
This moderator tracks the mechanism utilized to assign people to be vaccinated (or not). The
possibilities are random, alternate, and systematic.
Then, click [Ok]
Figure 44 | Data-entry | Step 12
As shown in Figure 45, click Insert > Column for > Moderator variable
• Set the variable name to “Climate”
• Set the column function to [Moderator]
• Set the data type to [Categorical]
This moderator tracks the climate. The possibilities are Cold and Hot.
Then, click [OK]
Figure 45 | Data-entry | Step 13
Customize the screen
The program initially displays the odds ratio (Figure 46).
• We want to work with the risk ratio rather than the odds ratio.
• Additionally, we want to display the risk ratio in log units.
Therefore, we need to customize the display as follows.
• Right-click in any yellow column
• Click <Customize computed effect size display> [A]
Figure 46 | Data-entry | Step 14
The program displays this wizard (Figure 47).
• Tick the box for Risk ratio [B]
• Tick the box for Log risk ratio [B]
Figure 47 | Data-entry | Step 15
As shown in Figure 48, we can set Log risk ratio as the default effect size, and also hide the odds ratio.
• In the drop-down box, select Log risk ratio as the primary index [C]
• Un-check the box for odds ratio [D]
• Un-check the box for log odds ratio [D]
Then click [Ok]
Figure 48 | Data-entry | Step 16
The screen now looks like Figure 49.
Figure 49 | Data-entry | Step 17
Enter the data
You can enter the data manually, or copy and paste from Excel™ or another source (see Appendix 1:
The dataset).
In Figure 50 you enter effect-size data into the white columns [E]. The program automatically computes
the values in the yellow columns [F].
Figure 50 | Data-entry | Step 18
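As a rough sketch of what happens behind the yellow columns, these are the standard formulas for the log risk ratio and its variance from a 2×2 table. The counts below are hypothetical illustrations, not values from the BCG dataset, and this is not the program’s internal code:

```python
import math

def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio and its variance from 2x2 counts (standard formulas)."""
    risk_ratio = (events_t / n_t) / (events_c / n_c)
    log_rr = math.log(risk_ratio)
    # Approximate variance of the log risk ratio
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, variance

# Hypothetical counts, for illustration only
log_rr, variance = log_risk_ratio(4, 123, 11, 139)
```

A negative log risk ratio means the event is less common in the treated group.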
•
You may continue to add the other moderators, as enumerated in the appendix.
•
Or, open the file BCG.cma. There are two versions of this file, one using a period to indicate decimal places and one using a comma. Use the one that corresponds to your computer’s settings.
STEP 2: RUN THE BASIC META-ANALYSIS
To run the analysis, click [Analysis > Run Analysis] as shown in Figure 51.
Figure 51 | Data-entry | Step 19
The main analysis screen
The program displays the main analysis screen (Figure 52).
The current effect size [A] is “Log risk ratio”. If you want to switch to another effect size, click [Effect
measure: Log risk ratio] on the toolbar and select an alternate index.
The next few pages outline the main analysis in CMA using the traditional approach. However, this is
optional. As soon as you arrive at the main analysis screen (Figure 52) you can click [Analysis > Meta-regression 2] to proceed immediately to the regression module.
The initial meta-analysis
In Figure 52, the <Fixed> tab [B] is selected, so the program is displaying the results for a fixed-effect
analysis [C]. The effect size is in log units [A].
Figure 52 | Basic analysis | Fixed-effect | Log risk ratio
Click the tab [D] for <Random>. The program [E] displays results for a random-effects analysis.
Figure 53 | Basic analysis | Random-effects | Log risk ratio
Display moderator variables
Next, we want to display the moderator variables on the plot.
Note that this is optional, and has no effect on the regression.
In Figure 54, click View > Columns > Moderators [F]
Figure 54 | Basic analysis | Display moderators
The program displays a list of all variables that had been defined as moderators on the data-entry screen. Drag and drop each of the following onto the main screen, to the right of the “p-value” column (Figure 55 [G]): Latitude, Year, Allocation, and Climate.
Figure 55 | Basic analysis | Display moderators
The screen should now look like Figure 56.
You can right-click on any column and sort by that column. Here, the studies are sorted by latitude.
Since the data had been sorted by latitude on the data-entry screen, the program initially displays the
studies in that sequence. It appears that the effect size is minor (near 0 in log units) for studies in
warmer climates (toward the top) [H] and larger (as extreme as −1.58) for studies in colder climates
(toward the bottom) [I].
Figure 56 | Basic analysis | Display moderators
Display statistics
Click <Next table> to display the statistics shown in Figure 57.
Using random-effects weights [J], the summary log risk ratio is −0.7141. The Z-value is −3.995 with a corresponding p-value of 0.0001. Thus, we can reject the null hypothesis that the log risk ratio is 0.0 (or equivalently, that the risk ratio is 1.0). If we assume that the studies are valid, we can conclude that the vaccine (on average) probably does prevent TB.
At the same time, there is also a substantial amount of dispersion in the effect size. Tau-squared [K] is
0.3088 and Tau is 0.5557. To get a general sense of the true dispersion we can assume that the true
effects are balanced about the random-effects estimate of the mean effect, and that some 95% of all
true effects fall within 1.96 T of this mean. Then (in log units) most true effects fall in the range of
−1.8032 to +0.3750. This corresponds to risk ratios of approximately 0.16 (a strongly protective effect)
to 1.46 (a harmful effect).
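The arithmetic behind this range can be sketched directly from the reported values (small differences in the last decimal place reflect rounding of the displayed inputs):

```python
import math

mean_log_rr = -0.7141  # summary log risk ratio (random-effects)
tau = 0.5557           # estimated SD of true effects (T)

# Range expected to cover ~95% of true effects, in log units
lower = mean_log_rr - 1.96 * tau
upper = mean_log_rr + 1.96 * tau

# Convert the limits back to risk ratios
rr_lower = math.exp(lower)
rr_upper = math.exp(upper)
```

Here `rr_lower` is roughly 0.16 (strongly protective) and `rr_upper` roughly 1.46 (harmful), matching the text.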
It would be very important to understand the reason for this dispersion, and for this purpose we turn to
meta-regression.
Figure 57 | Basic analysis | Display statistics for heterogeneity
STEP 3: RUN THE META-REGRESSION
At this point we proceed to the meta-regression.
On the analysis screen (Figure 58), select Analysis > Meta-regression 2.
Note.
If you don’t see any regression option you may have a lite or standard version of the program, rather
than the professional version.
If you see an option for Meta-regression but not Meta-regression 2, you have Version 2 of CMA rather
than Version 3.
Figure 58 | Run regression | Step 01
The Interactive Wizard
The program displays the screen shown in Figure 59.
The interactive wizard will walk you through all the steps in running the regression. To display or hide
the wizard, use the Help menu.
Figure 59 | Run regression | Step 02
When you initially open the regression module the program displays the following:
•
The main screen [A]
•
A list of available covariates [B]
Figure 60 | Run regression | Step 03
We need to move the covariates from the wizard [B] onto the main screen [A].
Add variables in the sequence shown here (allocation, year, and latitude) to recreate the example that
we use in this text.
•
Click “Allocation” on the wizard
•
Click [Edit reference group] and select [Random]
•
Click [Add to main screen] [C]
•
Click “Year” on the wizard and then click [Add to main screen] [C]
•
Click “Latitude” on the wizard and then click [Add to main screen] [C]
The model is shown in Figure 61.
Note that “Allocation” is displayed as two lines [D], which are linked by a bracket. Since allocation is a
categorical variable the program automatically creates dummy variables to represent allocation. See
next chapter for a full discussion.
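The coding involved can be sketched as follows: with Random as the reference group, each study’s allocation is represented by two indicator variables, and Random itself is the case where both are zero. This is a minimal illustration, not the program’s internal code:

```python
# Dummy-code the three-level Allocation moderator with "Random" as the
# reference group (so Random is represented by both indicators being 0).
def code_allocation(allocation):
    return [int(allocation == "Alternate"), int(allocation == "Systematic")]

codes = {a: code_allocation(a) for a in ("Random", "Alternate", "Systematic")}
# Random -> [0, 0], Alternate -> [1, 0], Systematic -> [0, 1]
```

Each coefficient then measures the difference between that group and the reference group.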
Tick the check-boxes for all covariates [E]
Figure 61 | Run regression | Step 04
The covariates are controlled by the “Covariates” toolbar [F]. On this toolbar,
•
[Show covariates] shows or hides the wizard
•
[Remove covariates] allows you to remove a covariate from the main screen
•
[Move up] and [Move down] allow you to edit the sequence of covariates
•
The blue and red checks allow you to add (or remove) checks from a series of check-boxes
Set computational options
The program allows you to specify various options for the computations
Click Computational options to display the menu in Figure 62.
Figure 62 | Run regression | Step 05
Each of these options is discussed in Part 10: Computational options.
To follow the example in this text, set the options as follows.
•
Method for estimating T2 (Method of moments)
•
One-tailed or two-tailed test for p-values (Two-tailed)
•
Confidence level (95%)
•
Display the variance inflation factor (Off)
•
Z distribution or the Knapp-Hartung adjustment for p-values and confidence intervals (Z)
Run the regression
To run the regression, simply click “Run regression” on the toolbar [A] in Figure 63.
Figure 63 | Run regression | Step 06
STEP 4: NAVIGATE THE RESULTS
Main results screen (Fixed effect)
After you run the regression
•
Click [Main Results] [A]
•
Click [Fixed] at the bottom to select the statistical model [B]
Figure 64 | Main results | Fixed-effect
For a full discussion of how to interpret the output for a fixed-effect analysis, see Part 7: Understanding
the results.
Main results screen (Random effects)
After you run the regression
•
Click [Main Results] [C]
•
Click [Random] at the bottom to select the statistical model [D]
Figure 65 | Main results | Random-effects
For a full discussion of how to interpret the output for a random-effects analysis, see Part 7:
Understanding the results.
Difference between the fixed-effect and random-effects displays
Under the fixed-effect model we assume that there is one source of sampling error (within-study
variance), whereas under the random-effects model we allow that there may be two sources of
sampling error (within-study variance and between-study variance). Since the weight assigned to each
study is the inverse of the variance, the weight assigned to each study depends on the model. In the
following pages, this will be evident in the fact that for the statistics that are presented under both
models (such as the effect size and its standard error), the value depends on the statistical model.
Perhaps more fundamentally, the statistical model determines what statistics we choose to present.
The results screen is quite different for the fixed-effect vs. the random-effects model, reflecting the fact
that the model determines what questions we can ask of the data.
When we work with the fixed-effect model we assume that all studies share a common effect size. We
don’t need to estimate T2 (the between-study variance), since this is assumed to be zero. If T2 is
assumed to be zero, we don’t estimate I2 (the ratio of between-study variance to total variance) since
this must be zero. Nor do we estimate R2, the proportion of between-study variance explained by the
predictors, since this must also be zero. For example, see Figure 71.
By contrast, when we work with the random-effects model we allow that the true effect size may vary
from one study to the next, and therefore these statistics help us to understand this variation. We can
estimate the variation of effects sizes (a) about the grand mean and (b) about the regression line. By
comparing the two we can compute R2, the proportion of variance explained by the predictors. We can
also compute I2 for each case (with and without covariates), and this tells us what proportion of the
observed variance reflects variation in true effect sizes rather than random error. For example, see
Figure 76.
Plot
To display the plot
•
Click [Scatterplot] on the menu bar to navigate to the plot [A]
•
Select [Fixed] or [Random] on the tab at the bottom of the screen [B]
•
To specify the variable for the X-axis,
o Right-click on the X-axis label [C], or
o Click on the drop-down tool [D]
For a full discussion of the plot see page 157.
Figure 66 | Plot
Other screens
To navigate to other tables of results, click “More results” [A] in Figure 67 and then select any of the
following.
•
Covariance (see page 132)
•
Correlation (see page 133)
•
Diagnostics (see page 125)
•
R-squared graphic (see page 145)
About the data included in (or excluded from) the analysis
•
All studies (see page 244)
•
Valid studies (see page 244)
Statistics that compare different predictive models
•
Increments (see page 134)
•
Models summary (see page 262)
•
Compare models (detailed) (see page 262)
•
Compare models (p-value) (see page 262)
Figure 67 | Other screens
STEP 5: SAVE THE ANALYSIS
Once you’ve run a meta-regression you can save the predictive model as shown in Figure 68.
•
Click File > Save regression file as … [A]
•
This will save the regression template with an extension of .cmr.
Figure 68 | Save analysis
The .cmr file saves the instructions for the analysis, NOT the data. By analogy, programs such as SPSS™, SAS™, and Stata™ allow you to save a set of commands in one file and the data in another file. The commands can then be applied to any data file that has the same variables.
•
The .cmr file, saved here, is analogous to the command file in the other programs.
•
The .cma file, saved from the data-entry screen, is analogous to the data file in the other programs.
The .cmr file saves the following:
•
The list of covariates
•
The list of models
•
The check-boxes for each model
•
The sets
•
The model names
In another session you can open a data file on the main data-entry screen. Then, return to the
regression module and click File > Open file to open the .cmr file and re-run the analysis.
The .cmr file can be used with the same dataset that was used to create it, or with another dataset that
includes the same variables. For example,
•
You may return to the main analysis screen and edit the study filters.
•
You may be working with an entirely different data set that has the same variables as the first one.
In any of these cases, navigate to the regression module and click File > Open to open the .cmr file.
When you open a .cmr file the program simply recreates the main MR screen as though you had entered
it manually. The .cmr file does not save the statistical settings that were in place when the file was
created. These include the method employed to estimate T2, the use of Z or Knapp-Hartung, the confidence level, and the choice of a one-sided or two-sided test.
STEP 6: EXPORT THE RESULTS
The program offers two options for exporting the results of any analysis.
•
Export the results to Excel™. Then, you can perform additional computations within Excel™, and/or format the results and copy them as a table to other programs.
•
Copy the results to the clipboard as a picture. Then, paste this picture into Word™ or any other program.
Figure 69 shows an example for the main analysis screen.
•
Click [File > Save results as Excel™ file and open] [A]
•
Provide a name for the Excel™ file
Figure 69 | Export results
The program creates the Excel™ file shown in Figure 70.
Figure 70 | Export results
The same idea applies to any screen that displays results.
PART 7: UNDERSTANDING THE RESULTS
MAIN RESULTS
•
To run the regression, click Run Regression.
•
The program will display the Main Results screen.
•
Click [Fixed] or [Random] to select the statistical model.
The following pages show the results screen for each statistical model.
The top of the screen is similar for the two models (Figure 72 and Figure 77). For either model it shows
the impact of each covariate with other covariates partialled. The difference between the two screens is
that one is based on fixed-effect weights while the other is based on random-effects weights.
After that, however, the screens differ in some fundamental ways which reflect the difference in the
two statistical models. While this holds true for any predictive model, we’ll use the case of one
categorical covariate (Climate) as an example.
For the fixed-effect model (Figure 72) the program displays a table similar to the analysis of variance
table we see for primary studies. This table includes a row for
a) The model: the WSS for the deviation of the subgroup means from the grand mean
b) The residual: the WSS for the deviation of all studies from their subgroup means
c) The total: the WSS for the deviation of all studies from the grand mean
The program does not present any statistics for between-study variance (T2) nor for proportion of
between-study variance explained by the model (R2). For studies that share the same predicted value,
T2 is assumed to be zero, and so there is no reason to report it. If T2 with the predictive model in place is
assumed to be zero, then R2 also has no real meaning (by definition, once we apply the model, no
between-study variance remains).
By contrast, for the random-effects model (Figure 77) the program does not display an analysis of
variance table. The idea of partitioning the WSS only works if the weight for each study is constant. This
condition is met under the fixed-effect model since the weights are always the same (based on within-study variance), but not under the random-effects model since the weights (which are based also on T2) change when we introduce covariates (and thus the frame of reference for computing T2).
For the random-effects analysis we do want to present statistics for T2, and for R2. To do this, we need
to run a series of distinct analyses and then collate the results. Specifically, we run one analysis to get T2
with the covariates, and another to get T2 without the covariates. The change in T2 gives us the amount
of variance explained by the predictive model, and this value over the original T2 gives us the proportion
of variance explained (R2). To highlight the fact that these statistics are coming from separate analyses,
the statistics are not presented in a table, but rather in separate sections on the screen
MAIN RESULTS, FIXED-EFFECT ANALYSIS
To navigate to this screen
Click [Run regression] [A]
Figure 71 | Setup
The toolbar changes as shown in Figure 72.
•
Click “Main results” [B]
•
Click [Fixed] [C]
Figure 72 | Main results | Fixed-effect
Test of the model
Analysis of variance
In section [D] the total WSS (Weighted sum of squares, Q) is partitioned into the following.
Model
The model [E] is the test that the predictive model explains any of the variance in effect size. Put
another way, it asks if the dispersion of effects about the regression line is smaller when the regression
line is based on the covariates rather than based solely on the grand mean. Here, Q = 128.2186 with df =
4 and p < 0.0001, so we conclude that the predictive model explains (at least) some of the variance in
effect size.
Residual
The residual [F] is the test that the data are consistent with the model’s assumption of a common effect
size for all studies with the same predicted value. The Q value is 24.0144 with df = 8 and p = 0.0023. We
conclude that the data are not consistent with the assumptions of the fixed-effect model.
Total
The total [G] is the test that the variance for the full set of studies (with no predictors) is zero. The Q-value is 152.2330 with df = 12 and p < 0.0001.
In a primary study, the total sum of squares (SS) is the sum of the SS explained by the model and the SS
residual. Similarly, in a meta-analysis (with fixed-effect weights) the total weighted sum of squares
(WSS) is the sum of the WSS explained by the model and the WSS residual. As shown in Figure 72 [D],
WSS_T = WSS_M + WSS_RES = 128.2186 + 24.0144 = 152.2330   (1.11)
Similarly, the total df is the sum of the model and the residual df:
df_T = df_M + df_RES = 4 + 8 = 12   (1.12)
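These identities can be checked with simple arithmetic on the values from Figure 72:

```python
# Partition of the total weighted sum of squares (values from Figure 72)
q_model, q_residual = 128.2186, 24.0144
df_model, df_residual = 4, 8

q_total = q_model + q_residual     # matches the reported total of 152.2330
df_total = df_model + df_residual  # matches the reported total df of 12
```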
Impact of individual covariates [H]
In Figure 72, the test of the model [E] is an omnibus test for the full set of covariates. It tells us that the
set as a whole is related to effect size. By contrast, the rows at the top [H] address the unique impact of
each covariate – that is, the impact of each covariate when all of the other covariates are held constant.
Since the effect size is the risk ratio, all analyses are carried out in the log metric and all coefficients are
in the log metric. In this example, virtually all predicted effects are less than zero, so 0 is no effect, −1 is a
large effect, and −2 is a very large effect (see Figure 73). In this example, therefore, a negative coefficient means that as the covariate gets larger the vaccine is more effective (see Appendix 8: Interpreting regression coefficients).
To understand the direction of the effect size as a function of covariates, it’s helpful to work with the
scatterplot as discussed immediately below.
Allocation
Allocation type is a categorical covariate with three groups (Randomized, Alternate, and Systematic),
and therefore is represented by a set of two dummy variables. In Figure 72, the test of this set yields Q
= 6.3651 with df = 2 and p = 0.0412. Thus, there is evidence that effect size is related to allocation type.
The relationship between allocation type and effect size with other covariates partialled is displayed in Figure 73.
For a more specific analysis we can look at each line within the set. Alternate allocation has a coefficient
of 0.6320 (the vaccine is less effective in studies that employed alternate allocation as opposed to
randomized allocation) and a p-value of 0.0366. Systematic allocation has a coefficient of 0.3062 (the
vaccine is less effective in studies that employed systematic allocation as opposed to randomized
allocation). However, as will be discussed in the chapter on caveats (page 293), these findings may be
due to a confound with other factors.
Figure 73 | Plot | Fixed-effect
Click on Scatterplot and select Allocation to produce the plot shown in Figure 73.
There is a column for each allocation type (Random, Alternate, and Systematic). In each column the program displays the observed effect sizes as well as the summary effect size and the confidence interval for the summary effect size.
In this example the “Fixed” tab is selected at the bottom of the screen, so all statistics are based on the
fixed-effect model. The reader will note that the summary effect size for the alternate allocation studies
falls outside the range of the actual effect sizes for the two studies in this group. This reflects the fact
that these means are adjusted for other covariates (and serves as a caution against performing these
kinds of adjustments with a small number of studies).
Year
The coefficient for Year is +0.0235, which means that for every increase of one year the log risk ratio will
increase by 0.0235 (vaccine was less effective in later trials). The coefficient plus/minus 1.96 times the
standard error (0.0159) yields the 95% confidence interval for the coefficient, which is −0.0076 to
+0.0545. The coefficient divided by its standard error yields a Z-value of 1.4795, with a corresponding p-value of 0.1390. Thus, when latitude and allocation method are held constant, the relationship
between year and effect size is not statistically significant.
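The arithmetic for the interval and the test can be sketched from the displayed coefficient and standard error (the program carries more decimal places internally, so the last digit may differ):

```python
import math

coeff = 0.0235  # coefficient for Year (Figure 72)
se = 0.0159     # standard error of the coefficient

# 95% confidence interval for the coefficient
ci_low, ci_high = coeff - 1.96 * se, coeff + 1.96 * se

# Z test with a two-tailed p-value from the standard normal distribution
z = coeff / se
p = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
```

Because the interval spans zero and p exceeds 0.05, the Year coefficient is not statistically significant.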
Figure 74 | Plot | Year | Fixed-effect
Click on Scatterplot and select Year to produce the plot shown in Figure 74.
The regression line shows that as the Year increases, the effect size moves closer to zero. Since the
effect size is a log risk ratio, this means that the treatment effect (when adjusted for the other
covariates) declined over the years.
The confidence interval shows the range of regression lines that are consistent with the data – in other
words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested
by the arrows) until it encountered the confidence interval. The uncertainty is such that the true
regression line could be either in an upward or downward direction. This corresponds to the p-value of
0.1390 for Year in Figure 72, and the fact that the confidence interval for the coefficient included both
negative and positive values (−0.0076 to +0.0545)
Latitude
The coefficient for latitude is −0.0213, which means that for every increase of one unit (degree) in
latitude the log risk ratio will decrease by 0.0213 (vaccine is more effective at greater latitudes). The
coefficient plus/minus 1.96 times the standard error (0.0084) yields the 95% confidence interval for the
coefficient, which is −0.0378 to −0.0048. The coefficient divided by its standard error yields a Z-value of −2.526, with a corresponding p-value of 0.0115. Thus, even when year and allocation method are held
constant, the relationship between latitude and effect size is statistically significant.
Figure 75 | Plot | Latitude | Fixed-effect
Click on [Scatterplot] and select Latitude to produce the plot shown in Figure 75.
The regression line shows that as the absolute Latitude increases, the effect size moves further from
zero. Since the effect size is a log risk ratio, this means that the treatment effect (when adjusted for the
other covariates) increases as we move further from the equator.
The confidence interval shows the range of regression lines that are consistent with the data – in other
words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested
by the arrows) until it encountered the confidence interval. While there is substantial uncertainty, all
likely regression lines are in the same (downward) direction. This corresponds to the p-value of 0.0115 for Latitude in Figure 72, and the fact that the confidence interval for the coefficient includes only negative values (−0.0378 to −0.0048).
Summary
The model
The total Q of each effect size about the grand mean can be partitioned into its component parts – the Q
due to the variation in effect size that can be explained by the covariates, and the part that cannot.
•
Model. The Q-value for the model is 128.2186 with df = 4 and p < 0.0001, which tells us that effect size is related to at least one of the covariates.
•
Residual. The Q-value for the residual is 24.0144 with df = 8 and p = 0.0023, which tells us that the assumptions of the fixed-effect model have been violated.
•
Total. The Q-value for the total is 152.2330 with df = 12 and p < 0.0001, which tells us that effect sizes vary when we ignore subgroups and work with deviations of all studies from the grand mean.
Individual covariates
Where the test of the model is an omnibus test for the full set of covariates, the table at the top addresses the impact of each covariate with all other covariates held constant.
•
Alternate allocation is associated with a smaller effect size (but see the chapter on caveats). The p-value for allocation is 0.0415.
•
Studies performed in later years show less impact of the vaccine. The p-value for year is 0.1390.
•
Studies that fall further from the equator show more impact of the vaccine. The p-value for latitude is 0.0115.
MAIN RESULTS, RANDOM-EFFECTS ANALYSIS
To navigate to this screen
Click [Run regression] [A]
Figure 76 | Run regression | Setup
The toolbar changes as shown here
•
Click “Main results” [B]
•
Click [Random] [C]
Figure 77 | Main results | Random-effects
The results presented in Figure 78 are based on three separate analyses. Each of these analyses yields
specific items of information, which are pulled together on this screen.
Figure 78 | Main results | Random-effects
The differences among the three analyses are shown in Table 3.
Table 3

Section | Function                        | Covariates | Weights
D       | Random-effects estimates        | Yes        | V + T2
E       | Variance not explained by model | Yes        | V
F       | Original variance               | No         | V
Section D reports statistics for an analysis that employs random-effects weights and includes the
covariates. This provides a test of the model, and is also the analysis used in the table at the top of the
screen.
Section E reports statistics for an analysis that includes the covariates but assigns weights based on V.
This provides a goodness-of-fit test. Specifically, we use this analysis to estimate the residual T2, the
variance not explained by the covariates.
Section F reports statistics for an analysis that does not include covariates and assigns weights based on
V. This allows us to estimate the original T2, the total amount of variance.
Section [G] is based on the analyses in sections [E] and [F]. Section [E] gives us the variance that cannot
be explained by the covariates, while section [F] gives us the total variance. We can use these to
compute the ratio of explained to total, which is presented in section [G] (See Part 8: The R2 index).
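Using the T2 estimates reported in this example (an original T2 of 0.3088 from the analysis with no covariates, and a residual T2 of 0.1194 from the analysis with the covariates in place), the computation can be sketched as:

```python
tau2_original = 0.3088  # T2 with no covariates (section F)
tau2_residual = 0.1194  # T2 with the covariates in place (section E)

explained = tau2_original - tau2_residual
r_squared = explained / tau2_original  # proportion of true variance explained
```

Here `r_squared` is about 0.61, i.e. the covariates explain roughly 61% of the between-study variance.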
Test of the model [D]
Is effect size related to the covariates?
The test of the model is a simultaneous test that all covariates (except the intercept) are zero. The Q-value is 13.1742 with df = 4 and p = 0.0105. We reject the null and conclude that at least one of the
covariates is probably related to effect size.
Goodness of fit [E]
Is there any unexplained variance in the true effect sizes?
Immediately above, we saw that the covariates improve our ability to predict a study’s effect. But does this information enable us to completely predict a study’s effect – do all studies with the same values on all covariates share a common effect size? Or is there variance in true effects among studies with the same predicted value?
The Q statistic, based on the deviation of each study from its predicted value, is 24.0144, with 8 df and a
corresponding p-value of 0.0023. This tells us that the true effect size probably varies from study to
study, even for studies that are identical on all covariates. Put another way, the model is incomplete –
knowing a study’s allocation type, year, and latitude does not allow us to completely predict its effect
size.
How much variance is there?
In section [E] the program shows that T2, the variance of true effect sizes at any point on the regression line, is 0.1194. It follows that T, the standard deviation of true effect sizes at any point on the regression line, is 0.3455. We can use this to get a sense of how closely the true effects at any point on the regression line are (or are not) clustered together.
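The relationship between these numbers is simple arithmetic (a sketch using the reported T2):

```python
import math

tau2 = 0.1194          # residual T2 (section E)
tau = math.sqrt(tau2)  # T, the SD of true effects about the regression line

# Half-width of the interval expected to cover ~95% of true effects
half_width = 1.96 * tau
```

At any point on the regression line, about 95% of true effects are expected to fall within `half_width` (roughly 0.68 log units) of the line.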
In Figure 79 we’ve plotted all 13 studies, the regression line, and a series of normal curves about the
regression line. Each normal curve is centered at some point on the regression line, and extends 1.96 T
on either side of that line. If the true effects are normally distributed with standard deviation T, then
95% of studies with that predicted value will have a true effect size within the range of the normal
curve.
For example, consider the normal curve labeled [N]. The regression line crosses the Y axis at −1.5. If we
were to run many studies at this latitude, the mean effect in these studies would be −1.5. However, the
true effect size in any single study would typically fall somewhere above or below this value. The normal
curve tells us that 95% of these studies would have true effects in the range indicated by the curve,
approximately from −1.0 to −2.0.
The decision to display normal curves at three specific points is arbitrary. These curves could have been
placed at any points on the regression line.
Figure 79 | Dispersion of effects about regression line for latitude
What proportion of the observed variance is true variance?
The variance of the observed effects about the regression line incorporates both within-study variance
(error) and between-study variance (that can be potentially explained by additional study-level
covariates). The I2 statistic [E] is 66.69%, which tells us that some 67% of the variance of observed
effects about the regression line falls into the latter group.
A useful way of using I2 is to help us understand what the distribution of effects would look like if we
could plot the true effects rather than the observed effects. An I2 of 67% tells us that the variance of the
distribution would shrink by about one-third.
The problem with this number is that it’s in square units, and it’s not intuitive what it means that the
variance will shrink by a third. It might be more intuitive to work with the square root of I2, (I), which is
0.8166. If we were looking at the true scores rather than the observed scores, the dispersion of effects
about the regression line (in linear units) would shrink by about 18%.
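Both statements follow from simple arithmetic on the reported I2:

```python
import math

i_squared = 0.6669        # proportion of observed variance that is true variance
i = math.sqrt(i_squared)  # the same quantity on the linear (SD) scale

variance_shrink = 1 - i_squared  # plotting true effects shrinks the variance ~33%
sd_shrink = 1 - i                # and shrinks the linear dispersion ~18%
```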
Impact of individual covariates
The test of the model [D] is an omnibus test for the full set of covariates. It tells us that at least one of
the covariates is probably related to effect size. By contrast, the table at the top [F] addresses the
impact of each covariate with all of the other covariates partialled (or held constant).
Since the effect size is the risk ratio, all analyses are carried out in the log metric, and all coefficients are
in the log metric. In this example, virtually all predicted effects are less than zero, so 0 is no effect, −1 is
a large effect, and −2 is a very large effect. In this example, therefore, a negative coefficient means that
as the covariate gets larger the vaccine is more effective. (The reverse would be true if the predicted
values were all positive).
To understand the direction of the effect size as a function of covariates, it’s helpful to work with the
scatterplot as discussed immediately below.
Allocation
Allocation type is defined as a set of two covariates. The test of the set yields Q = 1.5402 with df = 2 and
p = 0.46, and so there is no evidence that effect size is related to allocation type.
For a more specific analysis we can look at each line within the set.
•
Alternate allocation has a coefficient of 0.4855 (the vaccine is less effective in studies that employed alternate allocation as opposed to randomized allocation) and a p-value of 0.3127.
•
Systematic allocation has a coefficient of 0.4574 (the vaccine is less effective in studies that employed systematic allocation as opposed to randomized allocation) and a p-value of 0.2260.
None of these p-values is statistically significant.
Figure 80 | Plot | Allocation method | Random-effects
Click on Scatterplot and select Allocation to produce the plot shown in Figure 80.
There is a column for each allocation type (Random, Alternate, and Systematic). In each column the program displays the observed effect sizes as well as the summary effect size and the confidence interval for the summary effect size.
In this example the “Random” tab is selected at the bottom of the screen, so all statistics are based on
the random-effects model.
The predicted effect size for studies that employed alternate allocation (center column) or systematic
allocation (right-hand column) is closer to zero than the predicted effect size for studies that employed
randomized allocation (left-hand column). As above, none of these differences is statistically significant.
Year
The coefficient for Year [H] is 0.0148, which means that for every increase of one year the log risk ratio
will increase by 0.0148 (the vaccine became less effective over time). The corresponding p-value is
0.5225.
Figure 81 | Plot | Year | Random-effects
Click on [Scatterplot] and select Year to produce the plot shown in Figure 81.
The regression line shows that as the Year increases, the effect size moves closer to zero. Since the
effect size is a log risk ratio, this means that the treatment effect (when adjusted for the other
covariates) declined over the years.
The confidence interval shows the range of regression lines that are consistent with the data – in other
words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested
by the arrows) until it encountered the confidence interval. The uncertainty is such that the true
regression line could be either in an upward or downward direction. This corresponds to the p-value of
0.5225 for Year in Figure 78, and the fact that the confidence interval for the coefficient included both
negative and positive values (−0.0306 to +0.0603).
Latitude
The coefficient for latitude [I] is −0.0190, which means that for every increase of one unit (degree) in latitude the log risk ratio will decrease by 0.0190 (the vaccine is more effective at greater latitudes). The coefficient plus/minus 1.96 times the standard error (0.0159) yields the 95% confidence interval for the coefficient, which is −0.0503 to 0.0122. The coefficient divided by its standard error yields a Z value of −1.1924, with a corresponding p-value of 0.23. Thus, when year and allocation method are held constant, the relationship between latitude and effect size is not statistically significant.
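These computations can be reproduced from the reported coefficient and standard error. The sketch below uses the rounded values from Figure 78, so the last digits differ slightly from the program's output:

```python
import math

# Coefficient and standard error for Latitude (rounded values from Figure 78).
b, se = -0.0190, 0.0159

z = b / se                                  # Z value
p = math.erfc(abs(z) / math.sqrt(2))        # two-sided p-value from the normal
lo, hi = b - 1.96 * se, b + 1.96 * se       # 95% confidence interval

print(round(z, 2), round(p, 2))             # about -1.19 and 0.23
print(round(lo, 4), round(hi, 4))           # about -0.0502 and 0.0122
```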
Figure 82 | Plot | Latitude | Random-effects
Click on [Scatterplot] and select Latitude to produce the plot shown in Figure 82.
The regression line shows that as the absolute Latitude increases, the effect size moves further from
zero. Since the effect size is a log risk ratio, this means that the treatment effect (when adjusted for the
other covariates) increases as we move further from the equator.
The confidence interval shows the range of regression lines that are consistent with the data – in other
words, we could pivot the regression line on its axis either counter-clockwise or clockwise (as suggested
by the arrows) until it encountered the confidence interval. There is substantial uncertainty in the
coefficient. This corresponds to the p-value of 0.2331 for Latitude in Figure 78, and the fact that the
confidence interval for the coefficient (−0.0503 to +0.0122) includes the null value, zero.
In this example none of the individual covariates has a p-value less than 0.05. Since the model as a
whole is statistically significant, the fact that no covariate is statistically significant probably reflects the
fact that some of the covariates are correlated with each other. For example, latitude or year might be
statistically significant if entered into the equation alone. However, if the two are correlated with each
other and compete to explain the same variance, neither has a unique impact that meets the threshold
for statistical significance.
Comparison of Model 1 with the null model
We want to report what proportion of variance is explained by the predictive model, and for this
purpose we need to know how much variance there was initially (with no covariates). For this reason
we run a regression with no covariates (the null model) and compute T2. Here, T2 is 0.3088, which is the
variance of all studies about the grand mean.
Proportion of variance explained
To get the final amount of variance we run a regression with the covariates and compute T2. This value,
reported above as 0.1194, is the variance of studies about their predicted value.
• With no covariates in the model [F] the unexplained variance (T2) is 0.3088.
• With covariates in the model [E] the unexplained variance (T2) is 0.1194.
• The difference between these values is the variance explained by the model, or 0.1894.
If the initial (total) T2 is 0.3088 and the unexplained (residual) T2 is 0.1194, then the ratio

$$\frac{T_{Residual}^2}{T_{Total}^2} = \frac{0.1194}{0.3088}$$  (1.13)

gives us the proportion of variance that is not explained by the covariates. R2, the proportion of variance that is explained by the covariates, is then

$$R^2 = 1 - \left( \frac{T_{Residual}^2}{T_{Total}^2} \right) = 1 - \left( \frac{0.1194}{0.3088} \right) = 0.6133.$$  (1.14)
In Figure 78 this is on the line labeled [G].
We show this graphically in Figure 83. At left, the normal curve [O] reflects the unexplained variance in true effects when the predicted value for each study is the grand mean. At right, the normal curves [P, Q, R] represent the variance in true effects when the predicted value for each study is the corresponding point on the regression line. This is the variance not explained by the covariates. The variance at the right is less than the variance at the left, which tells us that by using the covariates we can reduce the unexplained variance – or (equivalently) explain some of the variance.
Figure 83 | Dispersion of effects about two regression lines. (The curve at left [O] corresponds to T2 = 0.3088; the curves at right [P, Q, R] correspond to T2 = 0.1194.)
An equivalent approach to computing R2 is as follows. If the initial (total) T2 is 0.3088 and the
unexplained (residual) T2 is 0.1194, it follows that the difference (0.1894) is the T2 explained by the
model. Then we can compute R2, the proportion explained by the model, as
$$R^2 = \frac{T_{Explained}^2}{T_{Total}^2} = \frac{0.1894}{0.3088} = 0.6133.$$  (1.15)
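Both computations can be checked with a few lines of Python, using the T2 values reported above (0.3088 and 0.1194):

```python
# T-squared with no covariates (total) and with covariates (residual),
# as reported in the text.
t2_total, t2_residual = 0.3088, 0.1194

r2_a = 1 - t2_residual / t2_total            # via the unexplained proportion
r2_b = (t2_total - t2_residual) / t2_total   # via the explained T-squared

print(round(r2_a, 4), round(r2_b, 4))        # both 0.6133
```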
Summary
The full model
• The Q-Model is 18.85 with df = 1 and p < 0.0001. This tells us that effect size is related to (at least some of) the covariates.
• The Q-value for goodness of fit is 30.73 with df = 12 and p = 0.0012. This tells us that the effect size varies, even within studies that share the same value on all covariates.
• The Q-total is 152.23 with df = 12 and p < 0.0001. This tells us that effect sizes vary when we ignore the covariates and work with deviations of all studies from the grand mean.
The I2 statistic
• The observed variance in effect sizes is partly due to real differences and partly due to within-study sampling error. When there are no covariates [F] the I2 value is 92%, which tells us that 92% of the observed variance is real, and may potentially be explained by covariates. When we use these covariates [E] the I2 value is 66.69%, which tells us that 66.69% of the remaining variance is real, and may potentially be explained by additional covariates.
The R2 statistic
• The between-study variance is estimated at 0.1194 at any given point on the regression line based on these covariates, as compared to 0.3088 for the regression line based on the grand mean. This corresponds to an R2 of 0.6133, meaning that 61.33% of the variance in true effects can be explained by the covariates.
DIAGNOSTICS
To navigate to the diagnostics screen
• Run the analysis [A]
Figure 84 | Setup
The toolbar changes as shown here:
• Click More > Diagnostics [B]
• Select the statistical model (Fixed or random) from the tabs at the bottom
Figure 85 | Diagnostics
Observed value
This is simply the observed effect size.
The Predicted Value
The predicted (fitted) value, $\hat{T}_i$, for the ith study is the value obtained from using the estimated regression coefficients b0, b1, …, bp and the covariate values for the ith study xi1, …, xip to compute the value of the effect size predicted for that study by the regression model

$$\hat{T}_i = b_0 + b_1 x_{i1} + \cdots + b_p x_{ip}.$$  (1.16)
The Residual
The (unstandardized) residual value, ei, for the ith study is the difference between the observed value and the fitted value

$$e_i = T_i - \hat{T}_i.$$  (1.17)
If ei = 0, the fitted value and the observed value are identical (the fitted value is exactly on the regression
line or plane), but if ei is far from 0, the predicted value is not close to the observed value.
In meta-analysis, effect sizes and their fitted values from different studies can have very different
sampling uncertainties (standard errors). This makes it difficult to interpret differences in the
magnitude of residuals from different studies. Standardized or jackknifed residuals attempt to address
this problem of comparability by dividing the residual by its standard error.
Studentized Residual
The studentized residual value, esi, for the ith study is the residual divided by its standard error

$$e_{si} = \frac{e_i}{SE(e_i)}.$$  (1.18)

The standard error of ei is given by

$$SE(e_i) = \sqrt{\frac{1 - h_i}{w_i}},$$  (1.19)

where wi is the weight given to the ith effect size in the analysis and hi is the leverage of the ith effect size. Therefore the residual divided by its standard error (the ith studentized residual) is

$$e_{si} = e_i \sqrt{\frac{w_i}{1 - h_i}}.$$  (1.20)
It is important to note that the standard error of the residual depends on both the residual variance (which is determined by the conditional variance of the estimate, plus the random-effects variance component in random-effects models) and the configuration of predictors (including the values for the ith study). The studentized residual esi is on a standard scale, so the values from different studies are more comparable than those of the unstandardized residuals (the ei). If the regression model is correctly specified, the esi have approximately a normal distribution with unit standard deviation, so that esi values greater than 2 in absolute value occur only about 5% of the time by chance, and values greater than 2.5 are quite unusual. The actual sampling distribution of esi will often be closer to Student's t-distribution with k − Q degrees of freedom, so slightly larger reference values (than 2 and 2.5) may be appropriate for judging extremeness of residuals when k − Q is small.
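A minimal sketch of equations (1.18) through (1.20), with hypothetical values for the residual, weight, and leverage (not taken from the BCG data):

```python
import math

# Hypothetical inputs: residual, weight, and leverage for one study.
e_i, w_i, h_i = 0.30, 10.0, 0.25

se_e = math.sqrt((1 - h_i) / w_i)   # standard error of the residual, Eq. (1.19)
e_si = e_i / se_e                   # studentized residual, Eq. (1.18)

# Equivalently, Eq. (1.20):
e_si_alt = e_i * math.sqrt(w_i / (1 - h_i))

print(round(e_si, 4), round(e_si_alt, 4))
```

A value of about 1.10, as here, would not flag the study; values above 2 in absolute value would merit a closer look.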
Jackknifed Residual
The jackknifed residual eji is similar to the studentized residual in that it is standardized. However, the jackknifed residual is the difference between the observed effect size in the ith study and the fitted value of the ith study computed with the ith study deleted from the dataset. That is,

$$e_{ji} = \frac{T_i - \hat{T}_{(i)i}}{SE\left( T_i - \hat{T}_{(i)i} \right)},$$  (1.21)
where $\hat{T}_{(i)i}$ is the fitted value of the ith study computed from all studies except the ith study. To be precise,

$$\hat{T}_{(i)i} = b_{(i)0} + b_{(i)1} x_{i1} + \cdots + b_{(i)p} x_{ip},$$  (1.22)
where b(i)0, …, b(i)p are the regression coefficients estimated with the ith study removed from the dataset.
The jackknifed residual is designed to better reveal cases where the ith study does not fit the same model as the other studies. By removing the (potentially distorting) impact of the ith study from the computation of the regression coefficients, the jackknifed residual sometimes makes it easier to see how different an observed effect size is from what is expected if that study fit the meta-regression model that is appropriate for all of the other studies.
Let the weight of the ith study computed using the variance component estimate with the ith study removed be denoted wi(i); then the ith jackknifed residual is equivalent to

$$e_{ji} = e_i \sqrt{\frac{w_{i(i)}}{1 - h_i}}.$$  (1.23)
The sampling distribution of the jackknifed residual is similar to that of the studentized residual
(approximately normal) and similar reference values for judging extremeness are appropriate. The
actual sampling distribution of eji will often be closer to Student's t-distribution with k − Q − 1 degrees of freedom, so slightly larger reference values (than 2 and 2.5) may be appropriate for judging extremeness of residuals when k − Q − 1 is small.
Leverage
Leverage is a diagnostic that reveals how much potential influence a particular study can have on the
result of the meta-regression. Let hi be the leverage of the ith study. The values of the leverage are
always between zero and one inclusive, that is, 0 ≤ hi ≤ 1. The sum of the leverages h1 + … + hk = Q,
where Q is the total number of predictors including the intercept (that is Q = p +1 when there is an
intercept in the model and Q = p if there is no intercept). Thus the average value of the leverage is Q/k,
and estimates of regression coefficients are most efficient when all the leverage values are close to Q/k.
If hi = 0, this implies that the fitted (predicted) value of the effect size for the ith study would be the same
even if that study were not part of the data used to estimate the regression coefficients. In one sense,
this implies minimal influence. If hi = 1, this implies that the fitted value of the ith study could not be
estimated without the data from that study, in other words, the fitted value of that study depends
entirely on data from that study. This latter situation is equivalent to saying that there is a regression
coefficient (or linear combination of regression coefficients) whose estimate is determined entirely by
the data from the ith study. In other regression contexts, a reference value of 2Q/k has been suggested as indicating a study of high leverage.
The term leverage arises from a mechanical analogy. Imagine a scatterplot of the effect size versus one
predictor. In this one predictor situation, the studies that have x (predictor) values that are far from the
center of the data will have high leverage, because moving them up or down would have a large influence on the regression slope. When there is more than one predictor, there may be studies whose
combination of predictor values is far from the center in a multivariate sense. The leverage diagnostic
may reveal such multivariate outliers that are not obvious from looking at predictors one at a time.
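For the one-predictor case, the leverages have a closed form: hi = wi(1/Sw + (xi − x̄w)²/Sxx), where x̄w is the weighted mean of the covariate and Sxx is the weighted sum of squares about it. A sketch with hypothetical weights and covariate values (not the BCG studies):

```python
# Leverage for a weighted simple meta-regression (intercept + one covariate).
# Toy data: hypothetical weights and covariate values.
w = [10.0, 8.0, 6.0, 4.0, 2.0]
x = [13.0, 22.0, 33.0, 42.0, 55.0]

Sw = sum(w)
xbar = sum(wi * xi for wi, xi in zip(w, x)) / Sw          # weighted mean of x
Sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))  # weighted SS of x

h = [wi * (1 / Sw + (xi - xbar) ** 2 / Sxx) for wi, xi in zip(w, x)]

print([round(hi, 3) for hi in h])
print(round(sum(h), 6))  # leverages sum to Q = 2 (intercept + 1 covariate)
```

Note that the studies at the extremes of x get the largest leverages, and the leverages sum to Q, as stated above.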
Cook’s Distance
Cook’s distance, Di, for the ith study is a measure of how much the estimated regression coefficients
change (on the average) when the ith study is deleted from the dataset. Like the studentized and
jackknifed residuals, Di is standardized, but unlike them it is in a squared (distance-squared) metric.
One can think of Di as the squared difference between b, the vector of regression coefficient estimates estimated from all studies, and the vector b(i) of regression coefficients estimated from all studies except the ith study, divided by the variance of b; that is,

$$D_i = \frac{\left( b - b_{(i)} \right)' V^{-1} \left( b - b_{(i)} \right)}{p + 1} = \frac{w_i h_i e_i^2}{Q \left( 1 - h_i \right)^2},$$  (1.24)
where V is the covariance matrix of b. In other regression contexts, the value 4/(k − Q) has been suggested to help identify studies that have large influence.
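A minimal sketch of the second form of equation (1.24), again with hypothetical values (Q here is the number of coefficients including the intercept, as defined in the Leverage section):

```python
# Hypothetical inputs for one study.
w_i, h_i, e_i = 10.0, 0.25, 0.30
Q = 2  # intercept plus one covariate

# Cook's distance, second form of Eq. (1.24).
D_i = (w_i * h_i * e_i ** 2) / (Q * (1 - h_i) ** 2)

print(round(D_i, 4))  # 0.2
```

With k = 13 studies and Q = 2, the reference value 4/(k − Q) = 4/11 ≈ 0.36 would not flag this hypothetical study.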
DFFITS
DFFITS is a diagnostic that describes the change in the fitted (predicted) value of the ith study that would arise as a consequence of deleting the ith study from the data used to estimate the regression coefficients that produce the fitted value. DFFITS is defined as

$$DFFITS_i = \hat{T}_i - \hat{T}_{(i)i} = e_i \sqrt{\frac{w_i h_i}{\left( 1 - h_i \right)^2}},$$  (1.25)
where

$$\hat{T}_i = b_0 + b_1 x_{i1} + \cdots + b_p x_{ip}$$  (1.26)

and

$$\hat{T}_{(i)i} = b_{(i)0} + b_{(i)1} x_{i1} + \cdots + b_{(i)p} x_{ip},$$  (1.27)

where b(i)0, …, b(i)p are the regression coefficients estimated with the ith study removed from the dataset.
Like the jackknifed residual, DFFITS is designed to better reveal cases where the ith study does not fit the same model as the other studies. By removing the (potentially distorting) impact of the ith study from the computation of the regression coefficients, DFFITS sometimes makes it easier to see how different an observed effect size is from what is expected if that study fit the meta-regression model that is appropriate for all of the other studies. In other regression contexts, the reference value $2\sqrt{Q/k}$ has been suggested for identifying studies with potentially large impact on fitted values.
Variance
The variance vi of the ith study is the conditional (estimation error) variance of the effect size in the ith
study. Because vi depends on the sample size in each study, vi can vary substantially across studies.
Tau Squared
Tau-squared, τ2, is the estimate of between-studies variance among effect size parameters at any point
on the prediction line. An assumption of the meta-regression is that the true variance of effect sizes is
the same for all values of the covariate.
Sum
Sum is the total variance of the ith effect size, which is vi in fixed effect meta-regression or τ2+ vi in
random effects meta-regression.
Weight
The weight of the ith study, wi, is the actual (raw) weight assigned to this study in the analysis, namely the reciprocal of the total variance: wi = 1/vi in fixed-effect meta-regression and wi = 1/(τ2 + vi) in random-effects meta-regression.
Percent Weight
Percent weight for the ith study is the percentage of the total weight accorded to that study, that is, wi divided by the sum of all study weights.
How to use the diagnostics
Regression diagnostics are designed to be simple checks that reveal important features of the data and
the regression model fitted to that data. However, the multivariate situation is complex, and diagnostics
are typically imperfect.
For example, consider the important feature of collinearity (correlations) among predictors. It is well known that collinearity can degrade the quality of estimates of regression coefficients by increasing their sampling uncertainty. This can occur when two predictors are highly correlated but independent of the others, or when there is a high multiple correlation among predictors (when one predictor is almost a linear combination of several others). These two situations have different implications for the quality of regression estimates. In the former case, only the two coefficients corresponding to the correlated predictors may be poorly estimated. In the latter case, the impact of collinearity may affect more coefficients. A diagnostic designed to reveal collinearity, in general, may not be able to distinguish between the two types of collinearity. On the other hand, producing diagnostics tailored to a myriad of possible special cases increases the complexity of the suite of diagnostics, defeating the purpose of simple checks on data and the regression model.
We have implemented a set of diagnostics that have proven most useful in regression problems
generally and adapted them to meta-regression. All of these diagnostics are related, in that they are
different ways of looking at the extent to which the data associated with a study is inconsistent with the
meta-regression model that fits the other studies. They approach the problem in different ways
however.
• The leverage and Cook's distance focus on the impact of a study on the estimated regression coefficients.
• The residual and studentized residual focus on the difference between fitted (predicted) effect sizes based on all the data and the observed effect size in each study.
• DFFITS and the jackknifed residual focus on the difference between fitted values of each effect size when the regression coefficients are estimated with and without a particular study.
It is important to recognize that, because these diagnostics are closely related, it is not surprising that the same studies may be flagged by several of the diagnostics as having high impact or influence. In fact, it is more surprising (but not impossible) when a study is flagged by only one of them.
The diagnostics should not be used by themselves to exclude studies from a meta-analysis.
The diagnostics are intended to help identify studies that have substantial impact on the estimated
regression coefficients. The reference values we have given are not intended to be used like critical
values in a significance test, but as criteria for further evaluation. Just because a study has high impact
on the analysis does not make it incorrect. However, it is useful to know that a certain study has (or a
few studies have) substantial impact on the results. In such cases it is crucial to be sure of the integrity
of the studies with high impact.
It is also important to know that the impact of a study may change when the set of covariates in the
meta-regression is changed or the set of studies is changed (e.g., when a subset of studies is examined).
A study that has high impact may have much less impact when a certain covariate is removed from the
covariate set or a certain study is removed from the dataset.
Variance Inflation Factor
The variance inflation factor VIFj for the jth covariate is a diagnostic designed to provide information about the collinearity of the covariate set. One of the consequences of collinearity is that it increases the variance (the square of the standard error) of the regression coefficient estimates. If the standard error of a regression coefficient estimate is too large, it may be difficult to meaningfully interpret that estimate. For example, suppose that a particular coefficient expresses the difference between the average of two groups of standardized mean difference effect sizes, that the coefficient estimate is 1.0, and that the standard error of that coefficient is 2. In such a case, it is difficult to draw an informative conclusion because the results imply a 95 percent confidence interval for the mean difference between group mean effects of −3 to +5, a range which is consistent with substantially different substantive conclusions.
The VIFj indicates how much greater the variance of the regression coefficient estimate bj for the jth
covariate is than it would have been if the covariates were totally uncorrelated. A VIF value of 4 for a
particular covariate indicates that the standard error of that regression coefficient is twice as large as it
would have been if the covariate were uncorrelated with all the other covariates.
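For two covariates the multiple correlation reduces to their simple correlation r, and VIF = 1/(1 − r²). A sketch with a hypothetical correlation, chosen so that VIF is approximately 4:

```python
import math

# Hypothetical correlation between two covariates.
r = 0.866

vif = 1 / (1 - r ** 2)           # variance inflation factor
se_inflation = math.sqrt(vif)    # factor by which the standard error grows

print(round(vif, 2), round(se_inflation, 2))  # 4.0 2.0
```

As the text notes, a VIF of 4 corresponds to a standard error that is twice (the square root of 4) what it would be with uncorrelated covariates.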
A high VIFj value for the jth covariate does not necessarily mean that the standard error of the coefficient
of that covariate is too large for the estimate to be meaningful. For example, suppose that a particular
coefficient expresses the difference between the average of two groups of standardized mean
difference effect sizes, that the coefficient estimate is 1.0, and the standard error of that coefficient is
0.1. In such a case, one can still draw an informative conclusion even if VIF = 4, because the results imply
a 95 percent confidence interval for the mean difference between group mean effects of 0.8 to 1.2, a
range which is consistent with substantially the same substantive conclusion of a very large difference
between the group mean effect sizes.
Note that VIFj is not a property of the jth covariate alone, but depends on all of the other covariates as
well. Therefore removing one covariate from the covariate set may change (sometimes drastically) the
VIF values of several other covariates.
COVARIANCE
Figure 86 | Covariance matrix
To understand what these covariances represent, imagine that we draw a sample of studies, run the
regression, and get an estimate of BYear and BLatitude. We repeat this process j times, and each time get
an estimate of BYear and BLatitude. Then, we compute the covariance of BYear with BLatitude over the j
samples. This covariance would be 0.0003 [B].
The same idea applies to all cells in the matrix.
CORRELATIONS
Figure 87 | Correlation matrix
To understand what these correlations represent, imagine that we draw a sample of studies, run the
regression, and get an estimate of BYear and BLatitude. We repeat this process j times, and each time get an
estimate of BYear and BLatitude. Then, we compute the correlation of BYear with BLatitude over the j samples.
This correlation would be 0.8444 [B].
The same idea applies to all cells in the matrix.
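The correlation matrix is simply the covariance matrix rescaled: each correlation is the covariance divided by the product of the two standard errors (the square roots of the diagonal entries). A sketch with a hypothetical 2×2 covariance matrix (not the values from Figure 86):

```python
import math

# Hypothetical covariance matrix for two coefficient estimates.
# Diagonal entries are the squared standard errors of the coefficients.
cov = [[0.0004, 0.0003],
       [0.0003, 0.0007]]

se = [math.sqrt(cov[i][i]) for i in range(2)]
corr = [[cov[i][j] / (se[i] * se[j]) for j in range(2)] for i in range(2)]

print([[round(v, 4) for v in row] for row in corr])
```

The diagonal of the resulting matrix is 1.0, as in Figure 87.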
When the correlation between the estimates for two coefficients is high (close to 1.0 or close to −1.0), this tells us that the two covariates are highly confounded, and it is therefore difficult to isolate the unique impact of each. In this example, this is probably why (using random-effects weights) latitude was statistically significant when used by itself, but not when used in conjunction with year.
INCREMENTS
We use the term “increments” to refer to the process of adding one covariate at a time to the model,
and studying the change in the variance explained. This approach provides some kinds of information
that are not available in a single analysis that includes all covariates. For example, consider the [Main
results] screen shown in Figure 88.
Figure 88 | Main results | Random-effects
The test of the model, goodness of fit, estimates of T2 and R2, apply to the full model (Allocation, Year,
and Latitude). Suppose we want to know these statistics if (a) we include only allocation, (b) we include
allocation and year, and (c) we include allocation, year, and latitude. One way to get this information is
to actually run a series of analyses, adding one covariate at each iteration.
While the idea of running a series of analyses will work, it can be a tedious process, and also requires the
researcher to collate the results of all the analyses. To address this problem the program has
automated the process. When you define a model with covariates X, Y, Z, the program will run an
analysis with X, another with X and Y, and another with X, Y, and Z. It will then collate the results,
showing the key statistics at each iteration. Additionally, it shows the change in T2 and in R2, as well as a
statistical test for the change, at each iteration.
To make it clear how the increments work, we’re going to run a series of analyses and present the
results for each. Then we’ll use these to understand the information on the increments screen. In
practice, of course, you would need to run only one model (the one with all the covariates) and then
jump directly to the increments screen.
First, we include only the intercept.
In Figure 89 we have added all the covariates to the main screen, but only the box for the intercept is
ticked, so this will be the only covariate included in the analysis.
Figure 89 | Setup | Intercept only
Figure 90 | Main results | Intercept only
With only the intercept in the model (Figure 90),
• For variance explained by the model, Q = 0.0000, df = 0, p = 1.0000 [A]
• For variance unexplained by the model, Q = 152.2330, df = 12, p = 0.0000 [B]
• R2 for the model is 0.00 [C]
In Figure 91 we add tick-marks for the two dummy variables that represent Allocation.
Figure 91 | Setup | Intercept + Allocation
Figure 92 | Main results | Intercept + Allocation
With intercept + allocation in the model (Figure 92),
• For variance explained by the model, Q = 1.4349, df = 2, p = 0.4880 [A]
• For variance unexplained by the model, Q = 132.3676, df = 10, p = 0.0000 [B]
• R2 for the model is 0.00 [C]
In Figure 93 we add a tick-mark for Year and re-run the analysis.
Figure 93 | Setup | Intercept + Allocation + Year
Figure 94 | Main results | Intercept + Allocation + Year
With intercept + allocation + year in the model (Figure 94),
• For variance explained by the model, Q = 10.7159, df = 3, p = 0.0134 [A]
• For variance unexplained by the model, Q = 30.3951, df = 9, p = 0.0004 [B]
• R2 for the model is 0.56 [C]
Finally, in Figure 95, we add a tick-mark for Latitude and re-run the analysis.
Figure 95 | Setup | Intercept + Allocation + Year + Latitude
Figure 96 | Main results | Intercept + Allocation + Year + Latitude
With intercept + allocation + year + latitude in the model (Figure 96),
• For variance explained by the model, Q = 13.1752, df = 4, p = 0.0105 [A]
• For variance unexplained by the model, Q = 24.0144, df = 8, p = 0.0023 [B]
• R2 for the model is 0.6133 [C]
Alternatively, we could have jumped directly to the full model, run the analysis, and gone to the
increments page.
• Run the analysis that includes all the covariates (Figure 95)
• Click More results > Increments
• Select the statistical model tab (Fixed or random)
Figure 97 | Main results | Intercept + Allocation + Year + Latitude
Every row in this table copies information from a separate analysis.
• The row labeled Intercept copies information from Figure 90.
• The row labeled Allocation (the second row labeled Allocation, since this is a set) copies information from Figure 92.
• The row labeled Year copies information from Figure 94.
• The row labeled Latitude copies information from Figure 96.
Each column in this table corresponds to a section in the prior figures
• [A] columns copy information such as T2 from [A] in the prior screens
• [B] columns copy information about goodness of fit from [B] in the prior screens
• [C] columns copy information about R2 from [C] in the prior screens
Additionally, this table presents information about change from one row to the next
• [AA] columns show the change in T2 and in the test of significance
• [CC] column shows the change in R2
Suppose we want information about the model that includes allocation and year. On the line labeled
Year, we see that T2 is 0.1349, R2 is 0.5631, the model is statistically significant (Q = 10.72 with df = 3 and
p = 0.0134) but fails to explain all the variance (Q = 30.40, df = 9, p = 0.0004). These statistics were
copied from the analysis in Figure 94. If we return to that figure, we’ll see the same numbers.
The columns labelled [AA] and [CC] are unique to this screen, and address the change as we move from
one model to the next. The column labeled “Change from prior” gives the change in T2 and in R2. The
column labeled “Test of change” is the corresponding test of statistical significance.
For example, consider the line labeled “Year”. The table shows that T2 changed by −0.4247 (which is the
difference between 0.5596 on the prior line and 0.1349 for the current line). It shows that R2 changed
by 56.31% (which is the difference between 0.00% on the prior line and 56.31% on the current line). It
shows the statistical test for the change yields Q = 8.43, df = 1, p = 0.0037.
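The Change-from-prior columns can be reproduced from the T2 values at each step. The sketch below uses the values reported in the text (0.3088 for the intercept-only model, 0.5596 with allocation, 0.1349 adding year, and 0.1194 adding latitude); note that R2 is truncated at zero when T2 increases, as it does for allocation:

```python
# T-squared after each covariate (set) is entered, as reported in the text.
steps = [("Intercept", 0.3088), ("Allocation", 0.5596),
         ("Year", 0.1349), ("Latitude", 0.1194)]

t2_total = steps[0][1]
prev_t2 = None
for name, t2 in steps:
    r2 = max(0.0, 1 - t2 / t2_total)  # R-squared, truncated at zero
    change = None if prev_t2 is None else round(t2 - prev_t2, 4)
    print(name, round(t2, 4), round(r2, 4), change)
    prev_t2 = t2
```

The Year row reproduces the values discussed above: R2 of 0.5631 and a change in T2 of −0.4247 from the prior line.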
Note that the test for change corresponds to the impact of each covariate at the point that it is entered
into the model. Thus,
• The change for Allocation corresponds to the impact of allocation with no covariates held constant. The p-value of 0.4880 in Figure 97 corresponds to the p-value of 0.4880 in Figure 92.
• The change for Year corresponds to the impact of Year with allocation held constant. The p-value of 0.0037 in Figure 97 corresponds to the p-value of 0.0037 in Figure 94.
• The change for Latitude corresponds to the impact of Latitude with Year and allocation held constant. The p-value of 0.2331 in Figure 97 corresponds to the p-value of 0.2331 in Figure 96.
Note. The tests in the earlier screens report Z rather than Q. In each case you could square the Z-value
on the earlier figure to get the Q-value in Figure 97.
PART 8: THE R2 INDEX
R2 is the proportion of between-studies variance explained by the model. It is analogous to the R2 index
commonly reported for the proportion of variance explained by covariates in primary studies.
Consider the example shown in Figure 98. We’ve used latitude as the sole covariate.
Figure 98 | Setup
Figure 99 | Main results | Latitude | Random-effects
In Figure 99 [A] the program shows that the proportion of variance explained by latitude is 0.79. Before
turning to the computation, let’s take a moment and get an intuitive sense of what this means.
Figure 100 | Dispersion of effects about grand mean (T2 = 0.3088) vs. dispersion of effects about regression line (T2 = 0.0633)
Figure 100 includes two plots, as follows.
The left-hand plot shows the dispersion of all studies about a regression line where there are no
covariates. The predicted effect size for each study is the grand mean, and the variance of true effect
sizes about the mean (T2) is 0.3088. We’ve superimposed a normal curve based on T, and it covers
about 95% of all true effects in this population.
The right-hand plot shows the dispersion of all studies about a regression line based on latitude. The
predicted effect size for each study is the regression line, and the variance of true effect sizes about the
regression line (T2) is 0.0633. We’ve superimposed a series of normal curves based on T. At any point
on the regression line, the curve covers about 95% of all true effects for studies at that latitude.
The normal curves in the right-hand plot are smaller than the normal curve in the left-hand plot. This
reflects the fact that by using the regression line at the right to predict effect sizes we are able to make
better predictions. In fact, we are able to reduce the unexplained variance by 79.5%. The R2 value
describes this reduction in unexplained variance.
At this point we can return to the screen in Figure 99.
• To get the initial amount of variance we run a regression with no covariates and compute T2. Here, T2 is 0.3088 [B], which is the variance of all studies about the grand mean.
• To get the final amount of variance we run a regression with the covariates and compute T2. This value, reported above as 0.0633 [C], is the variance of studies about their predicted value.
• If the initial T2 is 0.3088 and the remaining T2 is 0.0633, the difference (0.2455) is the T2 explained by the model.
• The program [A] then uses these values to compute R2, the proportion explained by the model, using either

R2 = T2Explained / T2Total = 0.2455 / 0.3088 = 0.7950   (1.28)
or equivalently,

R2 = 1 − (T2Residual / T2Total) = 1 − (0.0633 / 0.3088) = 0.7950
The test that R2 is zero in the population is the same as the test of the model. For the model, Q = 18.845, df = 1, p < 0.0001 [E], so we can conclude that R2 in the population is probably not zero.
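The computation above can be sketched in a few lines of code. This is an illustration of the arithmetic, not the program's own implementation:

```python
# Sketch of the R-squared analog: the proportion of between-studies variance
# explained by the covariates, computed from two estimates of tau-squared.

def r2_analog(tau2_total, tau2_resid):
    """R-squared analog for meta-regression.

    tau2_total: T2 from a regression with no covariates (variance about the grand mean)
    tau2_resid: T2 from the regression with covariates (variance about the regression line)
    """
    return (tau2_total - tau2_resid) / tau2_total

# Values from Figure 99
print(round(r2_analog(0.3088, 0.0633), 4))  # 0.795
```

The two formulas above are equivalent: the ratio of explained to total variance equals one minus the ratio of residual to total variance.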
Notes
While the logic of R2 is the same for primary studies and for meta-regression, the actual computation is
different. (In primary studies the computation is based on the observed variance while in regression it is
based on the true variance. In primary studies all observations are given the same weight, while in
regression each study is given a different weight.) For this reason, when used with meta-analysis, the
index is sometimes called the R2 analog rather than R2.
In Figure 100, the normal curves are drawn to capture 95% of the dispersion in true effects, not in
observed effects. The observed variance is assumed to include both within-study variance and
between-study variance (T2). We are concerned only with the latter.
The normal curves correspond to two standard deviations on either side of the regression line. By contrast, R2 is based on the ratio of variances. Therefore, while the variance on the right is 79.5% smaller than the variance on the left, the normal curve is not 79.5% smaller (the ratio of the variances is not the same as the ratio of the standard deviations). Nevertheless, the correspondence is close enough for the purposes of this illustration.
THE SCHEMATIC FOR R2
The program features a schematic illustration of R2 (Figure 101).
To navigate to this screen click More > R-squared graphic
Figure 101 | Display R2
Figure 102 | Schematic for R2
In this figure, the bar represents the total variance in effects, T2, which we saw earlier in Figure 99 [D] is 0.3088. Note that this is not the observed variance (which includes within-study variance and between-study variance) but rather our estimate of the between-study (true) variance.
The full bar reflects the true variance of all effects about the regression line when there are no
covariates in the equation (that is, the true variance of all effects about the grand mean). The light-blue
and dark-blue parts of the bar reflect the fact that the total variance can be decomposed into parts that
can and cannot be explained by the model.
When we include the covariates in the model T2 is 0.0633. This is the variance of true effects about the
regression line, or variance that cannot be explained by the model. This is represented by the light blue
portion of the bar. If we look back at Figure 100, this is the ratio of the spread of the curves at right (squared) to the spread of the curves at left (squared).
The dark blue portion of the bar represents the between-study variance that can be explained by the
model, which is shown as 0.2455. We get this value by subtraction. If the unexplained variance with no
covariates is 0.3088 and the unexplained variance with covariates is 0.0633, then the variance explained
by the covariates is the difference, 0.2455.
We define R2 as the proportion of variance that is explained by the covariates. This is the proportion of the bar that is colored dark blue, in this case 0.79.
We can get R2 by taking the ratio of the dark blue (explained) to the total, using
R2 = T2Explained / T2Total = 0.2455 / 0.3088 = 0.7950   (1.29)

which is the formula displayed on the screen.
Alternatively, we can take the ratio of the light blue (not explained) to the total and subtract this from
1.0, using
R2 = 1 − (T2Residual / T2Total) = 1 − (0.0633 / 0.3088) = 0.7950   (1.30)
A SEEMING ANOMALY
In both primary studies and meta-analyses R2 is based on two estimates of the variance, i.e., with and
without covariates.
In primary studies both estimates are based on the same data, and so are linked. If one estimate is too
low the other will be too low also, and the computation of R2 will be largely unaffected.
For meta-analysis, by contrast, the situation is different. In a meta-analysis the two estimates of T2 are based on separate analyses, and it’s possible for one estimate to be too low while the other is too high. A simplified version of the possible outcomes is shown in Table 4.
Table 4

                            T2 with covariates
T2 with no covariates       Underestimate T2             Overestimate T2
Underestimate T2            R2 could be accurate (A)     R2 will be too low (B)
Overestimate T2             R2 will be too high (C)      R2 could be accurate (D)
Consider the situation in Cell (C), where we overestimate the initial variance and then underestimate the
final variance. It seems we have explained more variance than we actually did.
Conversely, consider the situation in Cell (B), where we underestimate the initial variance and then
overestimate the final variance. It seems we have explained less variance than we actually did.
While the error is easiest to see in cells B and C, there will also be error in cells A and D. Even if we underestimate both variances (Cell A) or overestimate both variances (Cell D), the magnitude of the errors will invariably differ, which will affect the estimate of R2.
When the true value of R2 is large, these errors are not obvious. If the true value of R2 is .40 and we
underestimate that value by .10, we simply assume that the correct value is 0.30. However, if the true
value of R2 is near zero and we underestimate the value, then our estimate could fall below zero.
For example, suppose the initial and final values of T2 are 0.20 and 0.19. Suppose further that the initial estimate is low and the second is high, so the observed values are 0.19 and 0.20. In this case it will
appear that the unexplained variance has increased, which would mean that R2 is negative. It’s easiest
to see this in Cell B, but this can also happen in cells A or D if one estimate has more error than the
other, even if both are in the same direction.
Since this must be due to sampling error (a proportion of variance cannot be negative) we simply set the
value of R2 to zero.
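The zero-floor rule described above can be sketched as follows. This is illustrative code, not the program's own implementation:

```python
# If sampling error makes the residual tau-squared exceed the initial
# tau-squared, the raw ratio goes negative; the reported R2 is set to zero.

def r2_analog(tau2_total, tau2_resid):
    """R-squared analog, floored at zero as described in the text."""
    if tau2_total <= 0:
        return 0.0
    raw = 1.0 - tau2_resid / tau2_total
    return max(0.0, raw)

# The example from the text: observed initial 0.19, observed final 0.20.
# The raw value would be about -0.053, so the reported value is 0.
print(r2_analog(0.19, 0.20))  # 0.0
```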
ASSESSING CHANGE IN THE MODEL
Suppose we run a model with only year as the covariate. Then we run a model with year plus latitude.
Just as we can report statistics for either model as compared with the null model (the intercept only) we
can also report statistics for the second model as compared with the first.
Specifically,
• We can test the Q-value for change in the model, to see if latitude adds any improvement in prediction over and above year.
• We can report the change in T2 with the change in the model.
• And, we can report the change in R2 with the change in the model, that is, the contribution of adding Latitude to the prediction model.
In Figure 103 we define a prediction model with the Intercept, Year, and Latitude.
Figure 103 | Setup
Run the analysis
• Select the statistical model tab (Random)
• Click More results > Increments
The table of increments (Figure 105) collates the results of three analyses, as follows
• Intercept only [D]
• Intercept plus Year [E]
• Intercept, Year, and Latitude [F]
At each iteration the program also displays the change in T2, the change in R2, and the statistical
significance of the change.
Figure 104 | Display increments
Figure 105 | Increments
To assess the statistical significance of the model vs. the null model (or the change in R2 vs. 0.0) we use the columns at the left (G)
• T2 with no covariates is 0.3088, which serves as our baseline for computing R2.
• When we use Year as a covariate T2 drops to 0.2377, and R2 is computed as .2303. The test that R2 is zero is given by Q = 2.21, df = 1, p = 0.1368.
• When we use Year and Latitude as covariates T2 drops to 0.0921 and R2 is computed as .7018. The test that R2 is zero is given by Q = 14.30, df = 2, p = 0.0008.
To assess the statistical significance of the model vs. the prior model (or the change in R2 vs. the prior R2) we use the columns at the right (H)
• T2 with no covariates is 0.3088, which serves as our baseline for computing R2.
• When we use Year as a covariate, T2 drops to 0.2377 and R2 is computed as .2303. The change in R2 for the second line vs. the first (that is, adding Year to the null model) is .2303. The test that the change is zero is given by Q = 2.21, df = 1, p = 0.1368. For this row, statistics for change versus the prior model are identical to statistics for change versus the null model (since the prior model is the null model).
• When we use Year and Latitude as covariates T2 drops to 0.0921 and R2 is computed as .7018. The change in R2 for the third line vs. the second (that is, adding Latitude to Year) is .4715. The test that the change is zero is given by Q = 9.17, df = 1, p = 0.0025.
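The way the increments table collates nested models can be sketched as follows, using the T2 values from Figure 105. This is an illustration only; slight differences from the screen's rounded R2 values are expected, since the program works with unrounded T2:

```python
# Each tuple: (model label, tau-squared after fitting that model).
models = [
    ("Intercept only", 0.3088),
    ("Intercept + Year", 0.2377),
    ("Intercept + Year + Latitude", 0.0921),
]

tau2_baseline = models[0][1]   # T2 with no covariates
prev_r2 = 0.0
for label, tau2 in models[1:]:
    r2 = 1.0 - tau2 / tau2_baseline   # R2 vs. the null model
    delta_r2 = r2 - prev_r2           # change in R2 vs. the prior model
    print(f"{label}: R2 = {r2:.4f}, change = {delta_r2:.4f}")
    prev_r2 = r2
```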
UNDERSTANDING I2
In Figure 106 the program shows results for a regression that included latitude as a covariate. The screen displays two estimates of I2, as follows
• For a regression with no covariates [A], I2 is 92.12%
• For a regression with covariates [B], I2 is 64.21%
In any meta-analysis, the dispersion of observed effects can be partitioned into two parts. One is the
dispersion of the true effects, and the other is dispersion due to sampling error.
The I2 statistic gives us the ratio of true to total variance. Since we see the variance of the observed
effects, but we care about the variance of the true effects, we can use I2 to serve as a link between the
two. If we start with the variance of the observed effects and multiply this by I2, we get the variance of
the true effects. Put simply, we get a sense of what the dispersion would look like if each study had a
really large sample size (and therefore minimal error).
If I2 is near 100%, then a plot of the true effects would look similar to the plot of the observed effects.
As I2 moves toward 0%, more and more of the observed variance is simply sampling error, and would
disappear if the studies were large enough (and we thus eliminated the sampling error).
In a regression we report estimates for two distinct types of I2.
On row A we report statistics for a regression with no covariates. On this line T2 is the variance of true effects about the regression line (which here is simply the grand mean). I2 tells us what proportion of the variance in observed effects about the regression line would remain if all studies had an extremely large sample size, so that essentially all error was removed.
On row B we report statistics for a regression with covariates. On this line T2 is the variance of true effects about the regression line (which here is based on latitude). I2 tells us what proportion of the variance in observed effects about the regression line would remain if all studies had an extremely large sample size, so that essentially all error was removed.
In both cases, the interpretation of I2 is the same. If we are presented with the variance of observed effects and we want to know the variance of the true effects, we multiply the former by I2 to get the latter. Note that if we multiply the observed variance by I2 the value we get is T2, which is the value presented on rows A and B. While T2 is the number we will actually employ in our computations (for example, to assign weights or to compute a prediction interval), I2 offers a way to get a visual sense of how the plot would change if we could somehow eliminate the error.
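The link between the observed variance, I2, and T2 can be sketched in code. The values here are hypothetical, not taken from the BCG analysis:

```python
# I2 is the ratio of true (between-studies) variance to total observed
# variance, so multiplying the observed variance by I2 recovers T2.

def i_squared(tau2, v_within):
    """I2 as the ratio of true variance to total observed variance."""
    return tau2 / (tau2 + v_within)

tau2 = 0.30   # hypothetical between-studies variance (T2)
v = 0.03      # hypothetical within-study (sampling error) variance

i2 = i_squared(tau2, v)
observed_variance = tau2 + v

# Multiplying the observed variance by I2 returns tau2
print(round(observed_variance * i2, 10))  # 0.3
```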
Figure 106 | Main results | Random-effects
Summary
In any meta-analysis, some of the observed dispersion reflects differences in the true effects, while some reflects sampling error. If we’re interested in the actual utility of the intervention, then we care about the former, not the latter.
I2 tells us what proportion of the variance reflects the former (T2) and what proportion reflects the latter
(V). It also provides a mechanism that allows us to get a sense of what the plot would look like if it was
based on the true effects rather than the observed effects. Specifically, if we construct a normal curve
that captures most of the observed effects, we could multiply that curve’s width by a factor of I. This gives us the approximate distribution of the true effects.
Critically, I2 is a proportion of variance, not an absolute variance. An I2 near 100% tells us that most of
the observed variance is due to variation in true effect sizes, but it does not tell us that this variance is
substantial. Conversely, a low value of I2 tells us that only a small proportion of the observed variance is
due to variation in the true effect sizes, but does not tell us that this variance is trivial.
PART 9: WORKING WITH THE PLOT
You can create a regression plot with one click and then modify it extensively. We will use the BCG
analysis for this illustration.
• Create the model shown in Figure 107.
• Then, click Run regression
Figure 107 | Setup
At this point the program displays a [Scatterplot] button on the toolbar
• Click [Scatterplot] [A]
Figure 108 | Main results | Random-effects
The program displays the screen shown in Figure 109.
Figure 109 | Plot of log risk ratio on Latitude | Random-effects
The regression line is based on the regression equation
• The variable on the X-axis varies
• Continuous variables are plotted at their means
• Categorical variables are plotted based on the proportion of studies in each category. For example, suppose that we’ve included “Hot” as a covariate, which is coded 0 for studies in cold climates and 1 for studies in hot climates. Suppose further that 7/13 studies are coded 1. The mean score for Hot would be 0.54, and the regression would be plotted for studies where Hot is 0.54.
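The arithmetic for the hypothetical “Hot” covariate from the example above can be sketched as:

```python
# 13 studies: 7 coded 1 (hot climate), 6 coded 0 (cold climate).
# When Hot is not on the X-axis, the line is plotted at the covariate's mean.
hot = [1] * 7 + [0] * 6

mean_hot = sum(hot) / len(hot)   # proportion of studies coded 1
print(round(mean_hot, 2))  # 0.54
```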
The confidence interval and prediction interval are based on the uncertainty of the coefficient of the variable on the X-axis and the intercept; they do not depend on the uncertainty of the coefficients for the other variables.
To set the variable for the X-axis
In Figure 110,
• Click [Graph by] on the tool bar [A]
• Or right-click on the variable name on the X-axis [B]
Figure 110 | Plot of log risk ratio on Latitude | Select variable for X-axis
The main screen includes four elements – the studies, regression line, confidence interval, and
prediction interval. Each can be set to show or hide independently of the others.
For purposes of this tutorial, un-check the buttons as shown in Figure 111 [A].
• Studies <Off>
• Regression line <Off>
• Confidence interval <Off>
• Prediction interval <Off>
Figure 111 | Plot of log risk ratio on Latitude | Blank canvas
Studies
Click [Studies] to display/hide the individual studies as in Figure 112.
Figure 112 | Plot of log risk ratio on Latitude | Studies
The program displays each study as a circle
• To set the circles to be proportionate to the study weight (or not) Click Format > Studies
• To edit the appearance of the circles Click Format > Studies
• To modify the color of the circles Click Color > Edit colors > Studies
Regression line
Click [Regression line] to display/hide the regression line as in Figure 113.
Figure 113 | Plot of log risk ratio on Latitude | Regression line
The program displays the regression line. This reflects the predicted effect size (on the Y-axis) for any
given value (on the X-axis).
• To edit the appearance of the regression line Click Format > Regression line
• To modify the color of the regression line Click Color > Edit colors > Regression line
CONFIDENCE INTERVAL AND PREDICTION INTERVAL
The confidence interval and prediction interval are two very different indices. The confidence interval
reflects the precision with which we estimate the mean value, while the prediction interval reflects the
actual dispersion of effects about the mean value. The former is based on the standard error, and the
latter on the standard deviation.
For example, consider a simple meta-analysis (no covariates) where we report the mean effect size. The confidence interval is a measure of precision. If the mean effect is 0.5 with a confidence interval of
0.4 to 0.6, this tells us that the mean effect in this population (the population of studies from which the
sample was drawn) probably falls in the range of 0.4 to 0.6. The estimate will become more precise as
the number of studies increases, and with an infinite number of studies the confidence interval will have
a width that approaches zero.
By contrast, the prediction interval is a measure of dispersion. It does not tell us about the mean effect
but rather about the dispersion of effects about that mean. Suppose that the mean effect was 0.5 but
the true effects ranged from 0.3 to 0.7. Suppose further that we had an infinite number of studies, and
therefore knew the mean effect precisely. The confidence interval would be 0.5 to 0.5, but the
prediction interval would be 0.3 to 0.7.
In real life, of course, we don’t have an infinite number of studies and therefore we don’t know the
mean precisely. If we estimate the mean as 0.5 and we estimate that 95% of studies will fall within 0.2
on either side of the mean, then the prediction interval will take into account the plus/minus 0.2 and
add to that the uncertainty in the mean. This is not as simple as adding the width of the confidence
interval to the width of the prediction interval (we actually work with the variances) but that’s the
general idea.
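The “work with the variances” idea above can be sketched in code. The standard error and between-studies standard deviation here are hypothetical, chosen so the intervals resemble the running example:

```python
# The prediction interval adds tau-squared (between-studies variance) to the
# squared standard error of the mean, rather than adding interval widths.
import math

m = 0.5        # estimated mean effect
se = 0.051     # hypothetical standard error of the mean
tau = 0.102    # hypothetical between-studies standard deviation
z = 1.96       # 95% normal critical value (a t-based version would be wider)

ci = (m - z * se, m + z * se)
pi = (m - z * math.sqrt(se**2 + tau**2), m + z * math.sqrt(se**2 + tau**2))

print(f"CI: {ci[0]:.2f} to {ci[1]:.2f}")   # about 0.40 to 0.60
print(f"PI: {pi[0]:.2f} to {pi[1]:.2f}")   # about 0.28 to 0.72
```

Note that the prediction interval is always at least as wide as the confidence interval, and the two coincide only when tau is zero.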
These same ideas apply to meta-regression as well. For any covariate we can estimate the coefficient,
and then the confidence interval and the prediction interval for any value of the corresponding
covariate. We can then display these on the graph.
For example, suppose the regression line shows the relationship between latitude and effect size. We
pick a latitude of X and find that the predicted effect is 0.50 with a confidence interval of 0.40 to 0.60
and a prediction interval of 0.30 to 0.70. This means that in the universe of studies from which we
sampled
• The mean effect for a study at latitude X is probably in the range of 0.40 to 0.60.
• The effect size for any single study usually falls in the range of 0.30 to 0.70.
Note that the prediction interval only makes sense when we apply random-effects weights. When we apply fixed-effect weights we assume that all studies at any given latitude have the same true effect size. By definition, the deviation of true effects about the predicted effect is zero. It follows that if the CI is 0.40 to 0.60, the PI will also be 0.40 to 0.60.
When working with the confidence interval or the prediction interval we need to base the intervals on
one-point or simultaneous computations.
Click [Computational options] > [One point] or [Simultaneous]
• One-point – In 95% of analyses, the confidence interval at any single latitude will include the true mean effect for that latitude.
• Simultaneous – In 95% of analyses, the confidence interval at all latitudes will include the true mean effect for those latitudes.
• One-point – In 95% of analyses, the prediction interval at any single latitude will include the true effect for a study selected at random at that latitude.
• Simultaneous – In 95% of analyses, the prediction interval at all latitudes will include the true effect for a study selected at random at that latitude.
These examples assume that the confidence level has been set to 95%.
Finally, note that the CI and PI are relatively narrow at the mean of X and get wider as we depart from
the mean. This is because any error in the coefficient gets multiplied as we depart from the mean of X
in either direction.
Confidence interval
Click [Confidence interval] to display/hide the confidence interval as in Figure 114.
Figure 114 | Plot of log risk ratio on Latitude | Confidence interval
The confidence interval is a measure of precision. It addresses the mean effect for any given latitude.
In Figure 114,
• In our sample of studies the mean effect size for a study at any given latitude is indicated by the regression line [C].
• In the universe from which we sampled, the mean effect size for a study at any given latitude probably falls in the confidence interval [B] to [D].
Click [Computational options] > [One point] or [Simultaneous]
• One-point – In 95% of analyses, the confidence interval at any single latitude will include the true mean effect for that latitude.
• Simultaneous – In 95% of analyses, the confidence interval at all latitudes will include the true mean effect for those latitudes.
These examples assume that the confidence level has been set to 95%.
• To set the confidence level (e.g., 90% or 95%) click [Computational options]
• To set the confidence interval to be based on Z or Knapp-Hartung click [Computational options]
• To edit the appearance of the confidence line Click Format > Confidence interval
• To modify the color of the confidence line Click Color > Edit colors > Confidence interval
Prediction interval
Click [Prediction interval] to show/hide the prediction interval as in Figure 115.
Figure 115 | Plot of log risk ratio on Latitude | Prediction interval
Where the confidence interval is an index of precision, the prediction interval is an index of dispersion.
In Figure 115,
• In our sample of studies the mean effect size for a study at any given latitude is indicated by the regression line [C].
• In the universe from which we sampled, the mean effect size for a study at any given latitude probably falls in the confidence interval [B] to [D].
• In the universe from which we sampled, the true effect size for a single study at any given latitude probably falls in the prediction interval [A] to [E].
Click [Computational options] > [One point] or [Simultaneous]
• One-point – In 95% of analyses, the prediction interval at any single latitude will include the true effect for a study selected at random at that latitude.
• Simultaneous – In 95% of analyses, the prediction interval at all latitudes will include the true effect for a study selected at random at that latitude.
These examples assume that the confidence level has been set to 95%.
• To set the confidence level (e.g., 90% or 95%) click [Computational options]
• To set the prediction interval to be based on Z or Knapp-Hartung click [Computational options]
• To edit the appearance of the prediction line Click Format > Prediction interval
• To modify the color of the prediction line Click Color > Edit colors > Prediction interval
To identify specific studies
The program allows you to identify any study in the plot.
In Figure 116,
• Click [Identify study] [A]
• Click on any study [B]
• The program displays the study name [C]
Figure 116 | Plot of log risk ratio on Latitude | Identify studies
Other options for customizing the graph are as follows

Appearance
• Line width – Format > Line width
• Font – Format > Font
• Font size – Format > Font size

Title and labels
• Title – Labels > Title
• X-Axis – Labels > X-axis
• Y-Axis – Labels > Y-axis

Study circles
• Proportionate – Format > Studies
• Line width – Format > Studies

Axes
• Scale for X-axis – Format > X-axis
• Scale for Y-axis – Format > Y-axis
• Decimals – Format > Decimals

Statistical Model
• Fixed – Select Fixed tab at bottom of screen
• Random – Select Random tab at bottom of screen

Predictive model
• Model 1 – Select desired model at bottom of screen

Export
• To Word – Files > Export to Word
• To PowerPoint – Files > Export to PowerPoint
• To File – Files > Export to File
• To Clipboard – Files > Copy to clipboard

• Equation – Show / Hide / Edit (the prediction equation)
• Annotation – Show / Hide / Edit (for the confidence interval and prediction interval)
• Comment 1 – Show / Hide / Edit (user’s optional comment)
• Comment 2 – Show / Hide / Edit (user’s optional comment)

Decimals
• Equation in plot – Select number
• X-Axis – Select number
• Y-Axis – Select number
Modify the colors
The program maintains two color schemes. These are called Printing and PowerPoint but can actually be used for any purpose. To switch between the schemes click
• Color > Use colors for printing
• Color > Use PowerPoint
After you’ve selected one scheme or the other, you can edit the color for any element on the screen. To modify colors for the current color scheme click
• Color > Edit colors (for current scheme)
Categorical variables
The plot for categorical variables works the same way as for continuous variables.
In Figure 117 we will plot by Allocation. This is a categorical variable that reflects the mechanism by
which patients were assigned to either vaccine or placebo. The group names, corresponding to the type
of allocation, are “Randomized”, “Systematic”, and “Alternate”.
In Figure 117 we define the prediction model, then click [Run regression].
Figure 117 | Regression | Setup
Figure 118 shows the main results.
Figure 118 | Regression | Main results | Random-effects
In Figure 119,
• Click [Scatterplot] [A]
• Click Graph by > Allocation [B]
Figure 119 | Regression | Plot | Categorical covariate
The plot shows one column for every category (Randomized, Alternate, and Systematic). The options to
show or hide the studies, regression line, confidence interval, and prediction interval are the same as
they were for continuous covariates.
While there are only two dummy variables (Alternate and Systematic) the program automatically adds a
column for the reference category (Randomized).
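The dummy coding described above can be sketched as follows. This is illustrative code for the coding scheme, not the program's own implementation:

```python
# Dummy coding for Allocation with "Randomized" as the reference category:
# two 0/1 variables (Alternate, Systematic); the reference is coded 0 on both.

def dummies(value, levels=("Alternate", "Systematic")):
    """Return the 0/1 dummy variables for one study's Allocation value."""
    return {level: int(value == level) for level in levels}

print(dummies("Randomized"))  # {'Alternate': 0, 'Systematic': 0}
print(dummies("Systematic"))  # {'Alternate': 0, 'Systematic': 1}
```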
Setting the scales
The program will automatically set the scale for the X-axis and Y-axis. This works well for any single plot. However, if you want to create a series of plots and ensure that they all employ the same scale, you’ll need to set the Y-axis manually. Otherwise, the Y-axis may differ from one plot to the next, making it difficult to compare plots.
To set the Y-axis manually click Format > Y-axis.
Figure 120 | Regression | Plot | Setting the scale anchors
PART 10: COMPUTATIONAL OPTIONS
The program allows you to set various computational options.
On the regression screen click Computational Options on the menu.
Figure 121 | Regression | Set statistical options
KNAPP-HARTUNG VS. Z
In primary studies when we perform a significance test we have the option to use either the Z -test or
the t-test. We use the Z -test when the population variance is known, and we use the t-test when we are
using the sample variance to estimate the population variance.
The choice of a test (t vs. Z) affects the p-value in two ways.
• First, the estimate of the standard error is greater with t than with Z, and so the test statistic is smaller.
• Second, when we use the t-test the value required for statistical significance is larger than it is with Z.
The difference between the two tests is most pronounced when the sample size is small. Once the
sample size passes thirty the difference between t and Z is minor, and at one-hundred the difference is
trivial.
While the choice between t and Z applies to cases where we compare two groups, the same idea applies
to cases where we compare more than two groups. Here, the choice is between the F statistic (when
the variance is estimated) and chi-squared (when the variance is known). The choices are shown in Table
5.
Table 5 – Test statistics in primary studies

                      Two groups    More than two groups
Variance estimated    t             F
Variance known        Z             χ2
We are faced with a similar situation in meta-analysis. Since the variances are often being estimated
from the observed data, it would make sense to use the t distribution to test the null hypothesis and to
construct confidence intervals. In fact though, researchers have traditionally used the Z distribution for
these purposes.
In the case of a fixed-effect model this distinction turns out to have little practical impact. The only
source of error is the variance within studies, and since the n within studies (accumulated across
studies) is typically well over thirty, the difference between t and Z is negligible. Therefore, the practice
of using Z has not been challenged.
However, in the case of a random-effects model, the situation is more complicated. Recall that the error
component incorporates two distinct elements – the within-study error and the between-study error.
We can justify using Z for the within-study error for the same reason that we justify that approach for
the fixed-effect model. However, the between-study variance is based on the number of studies, which
is typically small, and the difference between t and Z for this component of the variance is typically
substantial.
The solution proposed by Knapp and Hartung is to address each component of the variance separately.
Specifically, we would use the Z (or chi-squared) distribution for the within-study variance and the t (or
F) distribution for the between-study variance.
• When we are estimating the mean effect size in one set of studies this approach would apply to the test of the null.
• In a subgroups analysis this would apply to the test that compares the subgroup means.
• In a meta-regression it would apply to the test of each covariate and to the test of the model.
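The practical effect of switching from Z to t can be sketched with standard table values. The critical values below are assumed from a standard t-table (not computed here), for a hypothetical case of 13 studies and one covariate:

```python
# With k = 13 studies and one covariate (plus intercept), the t-test has
# k - 2 = 11 degrees of freedom, so its critical value exceeds Z's and the
# confidence interval widens (before any Knapp-Hartung SE adjustment).

se = 0.10                  # hypothetical standard error of a coefficient
z_crit = 1.960             # z for a two-sided 95% interval (table value)
t_crit = 2.201             # t with 11 df for a two-sided 95% interval (table value)

ci_width_z = 2 * z_crit * se   # interval width under the Z approach
ci_width_t = 2 * t_crit * se   # interval width under the t approach

print(round(ci_width_t / ci_width_z, 3))  # 1.123: roughly 12% wider
```

As the text notes, the gap between t and Z shrinks as the degrees of freedom grow, which is why the distinction matters mainly for the between-study component, where df depends on the (typically small) number of studies.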
Note.
The program allows you to select either option from the statistics menu.
When you select [Z-Distribution] the program uses Z and Q (Figure 123 and Figure 124).
When you select [Knapp-Hartung] the program uses t and F (Figure 125 and Figure 126).
Figure 122 shows a prediction model using Allocation, Year, and Latitude as covariates.
Figure 122 | Regression | Setup
Click Computation options > Z-distribution (Figure 123)
Figure 123 | Set statistical options | Z-Distribution vs. Knapp-Hartung
Figure 124 shows the main-results screen with this option in effect.
Figure 124 | Main results | Z-Distribution
The screen’s title shows that the Z-distribution is being employed
• [A] The standard errors are based on Z
• [B] The confidence intervals are based on Z
• [C] The test statistics and p-value for individual covariates are based on Z
• [D] The test statistic and p-value for the set is based on Q
• [E] The test statistic and p-value for the model are based on Q
• [F] The test statistic and p-value for Goodness of fit are based on Q
• [G] The test statistic and p-value for the model with only the intercept are based on Q
To select Knapp-Hartung click Computation options > Knapp-Hartung (Figure 125).
Figure 125 | Set statistical options | Z-Distribution vs. Knapp-Hartung
Figure 126 shows the main results with this option in effect.
Figure 126 | Main results | Knapp-Hartung
The screen’s title shows that Knapp-Hartung (KH) is being employed.
• [A] The standard errors are based on t
• [B] The confidence intervals are based on t
• [C] The test statistics and p-values for individual covariates are based on t
• [D] The test statistic and p-value for the set are based on F
• [E] The test statistic and p-value for the model are based on F
• [F] The test statistic and p-value for goodness of fit are based on Q, and not F. This is because this tests the null that T2 is zero; since the KH adjustment only affects the T2 part of the variance, when T2 is zero the adjustment is not applied.
• [G] Statistics for the model with only the intercept are based on Q, and not F, for the same reason.
In addition to using the t-distribution or F-distribution for the critical values, the standard error itself is modified by the Knapp-Hartung adjustment.
Compare Figure 124, which is based on Z, with Figure 126, which is based on Knapp-Hartung.
Table of coefficients
When we move from Z-Distribution to Knapp-Hartung
• The coefficients do not change
• The standard error increases [A]
• The confidence interval width increases [B]
• The Z-score is replaced by a (smaller) t-score [C]
• The p-value becomes larger (less significant) [C]
• The Q-value for a set is replaced by a (smaller) F-score [D]
• The p-value for a set becomes less significant [D]
Test of the model
When we move from Z-Distribution to Knapp-Hartung
• The Q-value is replaced by a (smaller) F-value
• The p-value becomes less significant
Goodness of fit
When we move from Z-Distribution to Knapp-Hartung [F]
• The numbers do not change. This is because the Knapp-Hartung adjustment only applies to the T2 part of the variance, but the goodness-of-fit test is computed assuming T2 is zero.
Comparison of Model 1 with the null model
When we move from Z-Distribution to Knapp-Hartung [G]
• The numbers do not change. This is because this comparison employs weights based on the within-study variance (V), and the Knapp-Hartung adjustment only affects the between-study variance (T2).
Notes
While it is always true that the p-value will be the same or higher (further from zero) for Knapp-Hartung
(KH), the extent of the difference depends on the amount of between-study variance and the number of
studies. To the extent that the between-study population variance is small and/or the number of
studies is large, the between-study error variance will be small, and the difference between the Z option
and the KH option will tend to be relatively small. Conversely, to the extent that the between-study
population variance is large and/or the number of studies is low, the difference between the two
options will tend to be relatively large.
You do not need to return to the [Modify models] screen to switch between Z and Knapp-Hartung.
Rather, if you’re already looking at the results you can simply change the setting and the results will
change.
Figure 124 and Figure 126 showed the impact of this option for the main screen, but the impact actually
affects all screens that display confidence intervals and/or tests of significance. The option also affects
the plots, since the confidence interval and prediction interval depend on the standard error and the
statistical distribution.
Since the Knapp-Hartung option is intended to address uncertainty in between-studies variance, most
people who use it do so only for the random-effects model. In CMA, the Knapp-Hartung option is only
available for random-effects models.
While these adjustments can be applied to any use of the random-effects model (that is, for a single
group of studies, for a subgroup analysis, and for meta-regression), to date we have only implemented
them for the meta-regression. We plan to update the other modules in the future.
The intent of the Knapp-Hartung adjustment is to improve the accuracy of p-values, confidence
intervals, and prediction intervals. Higgins and Thompson (2004) proposed an approach that bypasses
the sampling distributions and instead employs a permutation test to yield a p-value. Using this
approach we would compute the Z-score corresponding to the observed covariate. Then, we would
randomly redistribute the covariate values among studies and see what proportion of these redistributions
yields a Z-score exceeding the one that we had obtained. This proportion may be viewed as an exact p-value. This option is not implemented in CMA.
ONE-POINT OR SIMULTANEOUS CONFIDENCE INTERVALS FOR GRAPH
When you plot the regression line of effect size on a covariate in a meta-regression, the program allows
you to plot the confidence interval. The confidence interval reflects the uncertainty in the predicted
value (the height of the regression line) being plotted. For example, if we plot the regression line for
effect size on latitude, the confidence interval reflects the uncertainty in the predicted value of effect
size for each value of latitude, but it treats all other covariates as fixed at their mean.
There are two useful options for plotting the interval. We can plot an interval that is accurate for any
single point on the graph, or an interval that is accurate for all points on the graph simultaneously. To
select either option
Click Computation options > Simultaneous/One-point
• Accurate for one point means that if we were to select any one point on the regression line at random, in 95% of all possible regressions the true predicted value for that point would fall within the confidence interval displayed at that point.
• Accurate for all points means that if we were to look at all points on the regression line, in 95% of all possible regressions the true predicted values for all the points would fall within the confidence intervals displayed at those points. Note that this includes predictions for all possible values of latitude, not only those that happen to appear in the data set.
Obviously the second criterion is stricter (we want to make an inference about the predicted value for
all values of latitude rather than just one) and therefore the confidence interval will need to be wider. We
do this by using a multiplier based on the Scheffé adjustment.
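As a sketch of the arithmetic, the simultaneous interval replaces the usual two-tailed critical value with the textbook Scheffé multiplier, the square root of (number of model parameters × the F critical value). The helper below is hypothetical; the exact multiplier CMA applies is an assumption here.

```python
import math

def ci_halfwidths(se, z_crit, f_crit, n_params):
    """Half-widths of the one-point and simultaneous confidence intervals
    for a predicted value with standard error `se` (hypothetical helper).
    z_crit   : two-tailed critical value for a single point (e.g., 1.96)
    f_crit   : critical value of F with (n_params, error df) df
    n_params : number of parameters in the prediction equation
    The Scheffe multiplier sqrt(n_params * F) is the textbook form."""
    one_point = z_crit * se
    simultaneous = math.sqrt(n_params * f_crit) * se  # always the wider band
    return one_point, simultaneous
```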
The difference between the two can be seen by comparing Figure 127 (one-point) with Figure 128
(simultaneous). The regression line is identical in the two, but the confidence interval is wider in the
second.
This is true for all points on the plot, but is most evident toward either end of the regression line, since
uncertainty in the coefficient becomes more evident as we depart from the mean of the predictor. For
example, compare the width of the one-point confidence interval in Figure 127 [A] versus the
simultaneous confidence interval in Figure 128 [B].
To facilitate this comparison we used Format > Y-Axis to set the same scale for both plots.
Click Comments > Show annotation to include the details on the plot, as shown in the bottom right-hand
corner of each figure.
Figure 127 | Set statistical options | One-point confidence intervals
Figure 128 | Set statistical options | Simultaneous confidence intervals
OPTIONS FOR ESTIMATING τ2 (MM, ML, REML)
When we select random-effects the program needs to estimate the value of tau-squared (τ2), the true
between-studies variance. (We use the Greek symbol τ2 to represent the true value, and T2 to represent
the sample estimate of that value).
To select a method click [Computational options] as shown in Figure 129.
Figure 129 | Set statistical options | Estimating T2
There are three approaches commonly used to partition the variance and estimate τ2. These are
• Method of moments (MM), also known as the DerSimonian and Laird method
• Unrestricted maximum likelihood (ML), also known simply as maximum likelihood
• Restricted maximum likelihood (REML)
If we are not willing to assume that the effect sizes are normally distributed, MM is often the method of
choice. The method of moments does not depend on any assumptions about the distribution of the
random effects, so it has a robustness characteristic that the two other methods (which involve the
assumption that the random effects have a normal distribution) do not have. If we are willing to assume
a normal distribution of effects, then statisticians tend to prefer ML or REML, which are more efficient
than MM (the estimates have smaller variance).
Between ML and REML, ML tends to yield a more precise estimate of T2 (but with a bias) while REML
tends to yield a less biased estimate (but with less precision). With small numbers of studies imprecision
can be more important than bias, and so some prefer ML. With more studies, the balance may shift in
favor of REML.
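As an illustration, the method-of-moments (DerSimonian-Laird) estimate for a single group of studies can be computed from the fixed-effect weights. This is a sketch of the standard formula with our own function name and illustrative data, not CMA's internal code.

```python
def tau2_mm(y, v):
    """Method-of-moments (DerSimonian-Laird) estimate of tau-squared for a
    single group of studies -- a sketch of the standard formula."""
    w = [1.0 / vi for vi in v]                      # fixed-effect weights
    sw = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sw  # fixed-effect mean
    q = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    df = len(y) - 1
    return max(0.0, (q - df) / c)                   # negative estimates truncated to 0
```

Note the truncation at zero: when the observed dispersion Q is less than its degrees of freedom, the estimate is set to 0 rather than reported as negative.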
Note.
If you have already run the analysis and want to modify the statistical option, you do not need to return
to [Modify models] and re-run the analysis. Simply click [Computational options] and make a selection.
All three options for estimating τ2 can be used for a basic analysis as well as for regression. However, the
basic analysis module in CMA offers the MM option only. (We plan to add other options in the future).
ONE-SIDED VS. TWO-SIDED TESTS
Many statistical tests offer the choice of a one-sided or a two-sided test.
• Two-sided tests are appropriate when an effect in either direction would be meaningful.
• One-sided tests are appropriate when we only need to identify an effect in one direction, and an effect in the other direction would have the same implications as zero effect.
In the overwhelming majority of social-science and medical research, while we may (indeed almost
always do) expect the effect to fall in a specific direction, an effect that was statistically significant in the
other direction would still be important. For example, if we expected the treatment to improve survival
but it turned out to hurt survival, this would be critically important information. However, if the test had
been performed as one-tailed, then an effect in the reverse direction (that the treatment is harmful)
cannot be statistically significant by definition, even if the computed p-value is < 0.0001. Therefore,
except in rare instances, the two-tailed test is appropriate.
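The relationship between the two p-values can be sketched numerically. The helper below is hypothetical; it simply shows that a one-tailed p-value is informative only in the hypothesized direction.

```python
from statistics import NormalDist

def p_values(z):
    """One-tailed and two-tailed p-values for a Z statistic (hypothetical
    helper). The one-tailed value tests only the hypothesized (positive)
    direction; an effect in the other direction can never reach
    significance on that test, no matter how large."""
    nd = NormalDist()
    one_tailed = 1 - nd.cdf(z)             # hypothesized direction: z > 0
    two_tailed = 2 * (1 - nd.cdf(abs(z)))  # either direction counts
    return one_tailed, two_tailed
```

For z = 1.96 the two-tailed p is about 0.05 and the one-tailed p about 0.025; for z = −1.96 (the reverse direction) the one-tailed p is about 0.975, i.e., nowhere near significance.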
In the event that you select a one-tailed test, note that this applies only to the p-values for individual
covariates on the main results screen.
• It does not affect the confidence interval, since this is displayed for both the lower and upper limits.
• It does not affect the p-value for the test of the model. Since this is based on Q (or F), no direction can be specified and it must be two-tailed.
• It does not affect the p-value for a set of covariates. Since this is based on Q (or F), no direction can be specified and it must be two-tailed. (For consistency, this applies even if the set includes only one covariate.)
• It does not affect the p-value for a test of the increment. Since this is based on Q (or F), no direction can be specified and it must be two-tailed.
• It does not affect the confidence interval or the prediction interval on the plot. Since these are shown for both the lower and upper limits, they are displayed using multipliers for a two-tailed test.
PART 11: CATEGORICAL COVARIATES
Categorical covariates are covariates that represent a category or group, rather than a numerical score.
For example, the covariate “Allocation” reflects the mechanism employed to assign patients to either
vaccine or placebo. Each study is coded as “Randomized”, “Systematic”, or “Alternate”.
When we perform a subgroups analysis (as with an analysis of variance in a primary study) we can work
directly with the categorical covariate and classify each study by its allocation method (e.g., “Systematic”).
However, this is not possible when we perform a regression, since regression requires that we work with
numbers, not labels. Therefore, rather than working with the original variable we create so-called
“dummy variables”, numeric variables that stand for a group or category.
In this chapter we will discuss how to create and interpret these dummy variables. As always, we
assume that the reader is familiar with the use of dummy variables in primary regression, and our intent
is to show how the same rules apply for meta-regression.
Note. The mechanism for working with dummy variables in a regression depends on whether or not we
include the intercept in the regression equation. In this chapter we assume that we will include the
intercept. The alternative is discussed in [Part 12: When does it make sense to omit the intercept].
Overview
For a categorical variable with m groups, we need to create m − 1 dummy variables. Since Allocation has
three groups (Randomized, Systematic, Alternate) we need to create two dummy variables.
We need to select any one of the three groups to serve as the “Reference” group. Then, we create a
dummy variable for each of the other two groups (but not for the reference group). With three groups,
we have three options −
A. We can select Randomized to serve as the reference group. The dummy variables will be
Systematic and Alternate.
B. We can select Systematic to serve as the reference group. The dummy variables will be
Randomized and Alternate.
C. We can select Alternate to serve as the reference group. The dummy variables will be
Systematic and Randomized.
In this chapter we discuss −
• How to create the dummy variables
• How to use these in the regression
• How to select the reference group
Dummy variables
CMA is able to create the dummy variables automatically. We can use the Allocation variable as an
example. In Figure 130,
• Click on Show Covariates [A]
• Click on Allocation [B]
• Click on Edit reference group and select [Random] [C]
• Click [Add to main screen] [D]
Figure 130 | Creating dummy variables
Since we’ve set “Random” as the reference group, the dummy variables are “Alternate” and
“Systematic”. In Figure 131 [E] the program creates these and adds them to the variable list.
Figure 131 | Creating dummy variables
As always, tick the boxes [F] to include these variables in the current predictive model. Tick either of the
two boxes, and the other will be ticked automatically. This is because the two together represent allocation,
and not because they belong to the same set.
The two dummy variables are Alternate and Systematic. Following the conventions proposed by Cohen,
studies are coded “1” on a dummy variable if they belong to that variable’s group. Thus,
• A study is coded 1 for “Alternate” if it employed alternate allocation, or 0 otherwise.
• A study is coded 1 for “Systematic” if it employed systematic allocation, or 0 otherwise.
Therefore, the first three studies, shown in Figure 132 [G], should be coded as follows.
• Frimodt-Moller et al (Alternate) should be coded 1 for alternate and 0 for systematic
• TB Prevention trial (Random) should be coded 0 for alternate and 0 for systematic
• Comstock et al 1974 (Systematic) should be coded 0 for alternate and 1 for systematic
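The coding rule described above can be sketched as a small routine. This is an illustration of standard dummy coding (CMA creates these columns for you automatically); the function name is our own.

```python
def dummy_code(values, reference):
    """Dummy-code a categorical covariate: one 0/1 column for each group
    except the reference group (m - 1 columns for m groups). A sketch of
    what CMA does automatically; not the program's internal code."""
    groups = sorted(set(values))
    groups.remove(reference)          # the reference group gets no column
    return {g: [1 if v == g else 0 for v in values] for g in groups}
```

For the first three studies in the example (Alternate, Random, Systematic), with Random as the reference group, this yields exactly the codes described above: the Random study is coded 0 on both columns.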
Figure 132 | Categorical variables
If you’d like to see the actual codes assigned by the program, proceed as follows.
• Click Run Regression
• Click More results > All studies [H]
Figure 133 | Creating dummy variables
Figure 133 [I] shows that the codes have been assigned as expected.
• Frimodt-Moller et al (Alternate) is coded 1 for alternate and 0 for systematic
• TB Prevention trial (Random) is coded 0 for alternate and 0 for systematic
• Comstock et al 1974 (Systematic) is coded 0 for alternate and 1 for systematic
The selection of a reference group has no impact on the model
To this point we’ve introduced the idea of a reference group, and shown that the selection of a
reference group determines which dummy variables will be created.
Critically, while the selection of a reference group can be important for some aspects of the analysis (as
discussed below), it has no impact on the statistics for the model.
To emphasize this point, we present three versions of the regression.
• In Figure 134 the reference group is “Random”; the dummy variables [A] are Alternate and Systematic.
• In Figure 135 the reference group is “Systematic”; the dummy variables [A] are Alternate and Random.
• In Figure 136 the reference group is “Alternate”; the dummy variables [A] are Random and Systematic.
Figure 134 | Dummy variables | Allocation with “Randomized” as the reference group
Figure 135 | Dummy variables | Allocation with “Systematic” as the reference group
Figure 136 | Dummy variables | Allocation with “Alternate” as the reference group
In all three versions of the analysis, the statistics for the model are exactly the same.
• The test of the model [B] yields a Q-value of 1.4349 with 2 df and p = 0.4880.
• The goodness-of-fit test [B] yields a Q-value of 132.3676 with 10 df and p < 0.0001.
• The estimates of T2 and T [B] are 0.5596 and 0.7480, respectively.
• The estimate of I2 with covariates in the model is 92.45% [B].
• The estimate of R2 is 0.00% [C].
The reason that the statistics for the model are the same regardless of which group serves as the
reference group (and which two dummy variables are included in the regression) is that all three
versions incorporate precisely the same information. Concretely, once we know a study’s code on any
two dummy variables, we know precisely which allocation method was employed for that study. It
follows that (at least for purposes of the model) it doesn’t matter which two dummy variables we used.
Indeed, it must be this way. The set of dummy variables addresses the question “Is allocation related to
effect size?” and it must be true that we will get the same answer regardless of which mechanism we
employ to represent allocation in the regression.
Working with the “Set”
When the program creates a series of dummy variables to represent a categorical covariate, it
automatically defines these dummy variables as a “Set”. In Figure 134, Figure 135, and Figure 136 the
program has added a column labeled “Set”.
• In this column we see the label “Allocation”, which refers to the categorical variable.
• Brackets indicate the two dummy variables that represent allocation.
• Each of these variables has a two-part name, with the first part reflecting the core variable (Allocation) and the second part indicating the dummy variable (Alternate).
In our example (where there are three groups) the set includes two covariates [A].
• The line for “Alternate” (if it exists) addresses the impact of Alternate allocation vs. other types of allocation.
• The line for “Systematic” (if it exists) addresses the impact of Systematic allocation vs. other types of allocation.
• The line for “Random” (if it exists) addresses the impact of Randomized allocation vs. other types of allocation.
Thus, each of these lines addresses the impact of a specific allocation type.
By contrast, the “Set” addresses the impact of Allocation in general. This is an omnibus test that asks if
there are any differences in effect size among allocation types. These statistics are shown at the right-hand side of the display [D].
In our example Allocation (in the form of dummy variables) is the only covariate in the equation, and so
the test of the set is identical to the test of the model.
• The test of the set [D] yields a Q-value of 1.4349 with 2 df and p = 0.4880.
• Similarly, the test of the model [B] yields a Q-value of 1.4349 with 2 df and p = 0.4880.
Therefore, in this example we really didn’t need to present statistics for the set. We could have simply
relied on the statistics for the model.
However, this equality only exists when the variables in the set are the only variables in the model.
Typically, this is not the case. Rather, there will often be additional covariates in the model, and when
that’s true, the test of the set is quite different from the test of the model.
For example, suppose the model includes Allocation (dummy coded) and also latitude. The Set would
test the impact of allocation with latitude partialled. By contrast, the model would test the combined
(and simultaneous) impact of allocation and latitude. These are two entirely different issues.
How to select a reference group
To this point we’ve shown that selecting one group rather than another to serve as the reference
group has no impact on the statistics for the full model or on the statistics for the set.
Nevertheless, the selection of a reference group does have implications for which lines are displayed within
the set, and for how these lines are interpreted.
Consider the case where we use Random allocation as the reference group (Figure 134). In this case the
dummy variables are “Alternate” and “Systematic”.
The predicted effect size for a study is going to be

Y = B0 + B1(Alternate) + B2(Systematic)   (1.31)

Studies that employed Random allocation will have a code of 0 on both covariates, and so the predicted
value is

Y = B0 + B1(0) + B2(0)   (1.32)

or simply

Y = B0   (1.33)
In other words, for studies in the reference group, the predicted value is simply the intercept.
For a study in either of the other groups, the predicted value is the intercept plus the coefficient for that
group. Thus, the coefficient for Alternate gives us the difference between the predicted effect size in
that group and the predicted effect size in the reference group.
If we work with Figure 134, where the reference group is Random allocation,
• Random allocation – predicted effect size is the intercept (−0.9905), or −0.9905
• Alternate allocation – predicted effect size is 0.4780 units above the intercept, or −0.5125
• Systematic allocation – predicted effect size is 0.5822 units above the intercept, or −0.4083
If we work with Figure 135, where the reference group is systematic allocation
• Systematic allocation – predicted effect size is the intercept (−0.4083), or −0.4083
• Alternate allocation – predicted effect size is 0.1042 units below the intercept, or −0.5125
• Random allocation – predicted effect size is 0.5822 units below the intercept, or −0.9905
If we work with Figure 136, where the reference group is alternate allocation
• Alternate allocation – predicted effect size is the intercept (−0.5125), or −0.5125
• Random allocation – predicted effect size is 0.4780 units below the intercept, or −0.9905
• Systematic allocation – predicted effect size is 0.1042 units above the intercept, or −0.4083
Thus, the predicted value for each group is the same regardless of which group serves as the reference
group. The difference is that each version presents a different set of comparisons.
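To verify this numerically, the sketch below plugs the coefficients reported in Figures 134–136 into the prediction equation and confirms that all three parameterizations yield the same predicted effect for each allocation type. The dictionary layout and function name are illustrative only.

```python
# Coefficients as reported in Figures 134-136 (reference group varies).
versions = {
    "Random ref":     {"intercept": -0.9905, "Alternate": 0.4780, "Systematic": 0.5822},
    "Systematic ref": {"intercept": -0.4083, "Alternate": -0.1042, "Random": -0.5822},
    "Alternate ref":  {"intercept": -0.5125, "Random": -0.4780, "Systematic": 0.1042},
}

def predictions(coefs):
    """Predicted effect per allocation type: the intercept for the
    reference group, intercept + coefficient for the other groups."""
    b0 = coefs["intercept"]
    preds = {g: b0 + b for g, b in coefs.items() if g != "intercept"}
    # the reference group is the allocation type with no coefficient
    ref = ({"Random", "Systematic", "Alternate"} - set(preds)).pop()
    preds[ref] = b0
    return preds
```

Running `predictions` on each version recovers the same three values (−0.9905, −0.5125, −0.4083) regardless of the reference group.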
Note.
The standard error for the reference group is the standard error of that group’s mean effect. By
contrast, the standard error for the other groups is the standard error of the difference between that
group and the reference group.
Similarly, the p-value for the reference group tests the null that the mean effect size in the reference
group is zero. By contrast, the p-value for the other groups tests the null that the difference between
that group’s mean effect and the reference group’s mean effect is zero.
Finally, to show the correspondence between this analysis and a subgroups analysis, we show a
subgroups analysis (Figure 137) where we’ve grouped by allocation type.
Figure 137 | Subgroups | Allocation type
• The mean effect size for each subgroup [A] is the same as the numbers from the regression
• The Q-value for the model [B] is the same as the numbers from the regression
In this example, Allocation (or rather the dummy variables that represent allocation) is the only
covariate in the regression. Therefore, the intercept is simply the mean effect for the reference group
and the coefficients represent the difference in mean effects. If the regression model included other
covariates then all the statistics would be adjusted for the other covariates.
Note. For the subgroups analysis, Computational options > Random and mixed-effect options must be
set to pool estimates of T2.
Creating dummy variables manually
Above, we showed how to create dummy variables automatically. You also have the option of creating
dummy variables manually. On the data-entry sheet create a column for a moderator and then
(critically) define the moderator type as integer or decimal. Then, enter a value for each study.
While the option to create dummy variables automatically works well in most cases, there are several
cases where you’ll need to use the manual option.
Interactions
Suppose you have a categorical variable that is represented by a dummy variable A, and you want to
assess the impact of that variable and also its interaction with another variable B. You’ll need to work
with the variables A, B, and AB. In this case it will be easier to create A (and then AB) manually.
Alternate coding schemes
Dummy-coding is only one of the options possible for categorical covariates. Texts on multiple
regression discuss other options, such as effects-coding and contrast-coding. You can use any of these
coding schemes, but you’ll need to create the dummy variables manually.
Regressions with no intercept
The automatic coding scheme is only available when you include the intercept in the equation. When
you omit the intercept (see next chapter) the coding scheme changes (in that case, for m groups you
need m rather than m – 1 dummy variables) and the automatic function is not available.
If you create the dummy variables manually, you’ll also need to define these as a set manually. This is
discussed in Part 13: Working with “Sets” of covariates.
PART 12: WHEN DOES IT MAKE SENSE TO OMIT THE
INTERCEPT
In any primary regression or meta-regression we have the option to either include or exclude the
intercept from the prediction equation. The decision to omit the intercept usually arises when we are
working with a categorical variable, and we will discuss the issue in this context. This decision
fundamentally affects the issues addressed by the analysis, as follows.
When we include the intercept in a regression with categorical covariates
• Coefficients reflect differences in effect size across categories.
• Tests of a covariate address the question “Is the covariate related to effect size?”
• The model tests the null hypothesis that no covariate is related to effect size.
When we omit the intercept
• Coefficients reflect the absolute effect sizes within categories.
• Tests of a covariate address the question “Is the effect size zero in this category?”
• The model tests the null hypothesis that all groups have a mean of zero.
In all chapters up to this point we have assumed that we are including the intercept, which is typically
the case. However, there are cases when we may want to omit the intercept.
• First, we will address a technical issue about coding categorical variables. As explained below, the decision to include or omit the intercept impacts the way that we code categorical variables for the analysis.
• Second, we will show how to interpret an analysis where the intercept is omitted, and how this differs from an analysis where the intercept is included.
As always, we caution the reader that this chapter is not a comprehensive treatment of the topic. We
assume that the reader is familiar with these issues from multiple regression in primary studies. Our
goal here is to review the key concepts, and show how they can be applied in meta-analysis.
The example
The initial data set includes a covariate called latitude. For purposes of this example we need a
categorical covariate, so we will create a new variable called Climate, which is coded Hot or Cold
(latitude 33 and under, vs. 34 and higher).
Since Climate is categorical, it cannot be inserted directly into the analysis. Rather, we need to create a
numerical covariate corresponding to climate and use this in the analysis. We may use any of several
schemes for this purpose (dummy coding, effects coding, contrast coding) but what all of these schemes
have in common in the usual approach (including an intercept) is that we need m − 1 covariates to
represent a covariate with m groups. In the current example there are two groups (Cold and Hot) so we
need one covariate.
If we elect to use dummy coding, we can create a covariate called Hot and code it 0 for Cold studies and
1 for Hot studies. Alternatively we can create a covariate called Cold and code it 1 for Cold studies and 0
for Hot studies. We could use either of these covariates in the analysis but we could not use both since
one of them will be redundant.
This rule, that we need m − 1 covariates for a categorical variable with m groups, only applies when we
include the intercept in the regression. By contrast, when we omit the intercept we actually need m
covariates. In the current example we would include two covariates (Hot and Cold) in the analysis.
CMA includes a mechanism to create dummy variables automatically, but this mechanism (since it
creates m − 1 covariates) is intended only for cases where we include the intercept. For simplicity in this
chapter we will create all dummy variables manually, whether or not we include the intercept in any
given analysis.
Figure 138 shows the two dummy variables
• “Hot” is coded 1 if the study was located in a hot climate, and 0 otherwise [A]
• “Cold” is coded 1 if the study was located in a cold climate, and 0 otherwise [B]
Thus, a “1” indicates the presence of the attribute (Hot or Cold) while a “0” indicates the absence of that
attribute.
Figure 138 | Data-entry | Dummy variables for Hot and Cold
If we include the intercept we will include either Hot or Cold in the prediction equation. If we omit the
intercept we will include both Hot and Cold in the prediction equation.
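The two coding schemes can be sketched as design matrices. This is a generic illustration (the function is hypothetical, not CMA code), showing that with the intercept we use m − 1 dummy columns, and without it we use m indicator columns.

```python
def design_matrix(climates, include_intercept):
    """Design-matrix rows for the Climate example (a sketch).
    With the intercept: columns [intercept, Hot] -- m - 1 dummies.
    Without the intercept: columns [Hot, Cold] -- m indicators."""
    if include_intercept:
        return [[1, 1 if c == "Hot" else 0] for c in climates]
    return [[1 if c == "Hot" else 0, 1 if c == "Cold" else 0]
            for c in climates]
```

Including both Hot and Cold alongside an intercept would make one column redundant, which is why the usual approach drops one of them.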
Note. This becomes more complicated when we have two or more categorical variables, such as
Climate and Allocation. This is beyond the scope of this manual.
Interpreting the results
A meta-regression without the intercept will give us the mean effect for each group (Hot and Cold).
Before proceeding to the regression, we can use a subgroups analysis to see what these means actually
are.
On the main analysis screen click Computational options > Mixed and random effects options and then
select the option to “pool within-group estimates of tau-squared” as shown in Figure 139. Then click
Computational options > Group by and group by Climate. Select the “Random” tab at the bottom of the
screen.
Figure 139 | Basic analysis | Computing T2 in the presence of subgroups
Figure 140 | Basic analysis | Subgroups Cold vs. Hot
Figure 141 | Basic analysis | Subgroups Cold vs. Hot
Figure 140 and Figure 141 show the results of this analysis
• For the Cold subgroup [C] the mean effect size is −1.1987 with SE = 0.1769. The test addresses the question “Is this effect size zero?” and yields Z = −6.7740, p < 0.0001.
• For the Hot subgroup [D] the mean effect size is −0.2784 with SE = 0.1522. The test addresses the question “Is this effect size zero?” and yields Z = −1.8289, p = 0.0674.
• The test of the between-subgroups variance [E] addresses the question “Does the mean effect size differ by subgroup?” and yields Q = 15.5445, df = 1, p = 0.0001.
To compute the mean for each subgroup in meta-regression we would omit the intercept and include
both Cold and Hot as covariates (note that the box for intercept is unchecked).
Figure 142 | Regression | Setup | No intercept
Figure 143 | Regression | Main results | No intercept
Because we have omitted the intercept, the statistics reflect the mean effect size for each group.
• For the Cold subgroup [CC] the mean effect size is −1.1987 with SE = 0.1769. The test addresses the question “Is this effect size zero?” and yields Z = −6.7740, p < 0.0001.
• For the Hot subgroup [DD] the mean effect size is −0.2784 with SE = 0.1522. The test addresses the question “Is this effect size zero?” and yields Z = −1.8289, p = 0.0674.
Note that these numbers are exactly the same as the numbers we saw in the subgroups analysis (Figure 141). Line CC corresponds to line C, and line DD corresponds to line D.
Finally, in Figure 143, since there is no intercept, the test of the model addresses the null hypothesis
that the mean for all groups is zero. The Q-value of 49.2323 with 2 degrees of freedom yields a p-value
of < 0.0001. We reject the null and conclude that the mean effect size is probably not zero in at least
one of the groups. (The subgroups analysis does not include this test).
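The logic behind this equivalence can be sketched outside of CMA. The following Python fragment uses hypothetical effect sizes and weights (not the BCG data) to fit a weighted regression with no intercept and two subgroup dummies, and confirms that each coefficient is simply the weighted mean of its own subgroup:

```python
import numpy as np

# Hypothetical effect sizes (log risk ratios) and inverse-variance weights
y = np.array([-1.2, -1.1, -0.3, -0.25, -0.35])
w = np.array([10.0, 8.0, 12.0, 9.0, 11.0])     # 1 / (v_i + tau-squared)
cold = np.array([1, 1, 0, 0, 0])               # dummy: 1 = Cold climate
hot = 1 - cold                                 # dummy: 1 = Hot climate

# Weighted least squares with NO intercept and both dummies as covariates
X = np.column_stack([cold, hot]).astype(float)
W = np.diag(w)
b = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Each coefficient equals the weighted mean effect of its own subgroup
mean_cold = np.average(y[cold == 1], weights=w[cold == 1])
mean_hot = np.average(y[hot == 1], weights=w[hot == 1])
print(b, mean_cold, mean_hot)
```

Because the two dummy columns are orthogonal, the normal equations decouple and each coefficient reduces to its subgroup’s weighted mean, which is why the regression output matches the subgroups analysis exactly.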
PART 13: WORKING WITH “SETS” OF COVARIATES
DEFINING A “SET”
In regression there are times when we use several covariates to capture a concept. For example
• If we have a categorical covariate with m values, we use m – 1 covariates to represent this variable in the analysis (as discussed in the preceding chapter).
• If we want to assess the relationship between duration of treatment and effect we might include duration, duration2, and duration3 as predictors.
• We may have a series of covariates, such as income and education, that (together) represent the impact of socio-economic status.
• We may have a series of covariates, such as dose and duration, that (together) represent the intensity of a treatment.
• We may have two covariates and also the interaction between them, where the three together represent their influence on outcome.
When we define covariates as a Set, the program reports a test of significance for the Set with all other covariates held constant. For example, consider the analysis displayed in Figure 144. The two covariates are dummy variables that (as a set) capture the Allocation method.
Note that in Figure 144 there is now a column labeled “Set” [C]. Under this column there is a set called “Allocation” and the program has inserted brackets to show which covariates are included in the set.
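The test of a Set is a joint (Wald-type) test of all coefficients in the set. As a sketch of the computation, using hypothetical coefficients and a hypothetical covariance matrix (not the values from Figure 144):

```python
import math
import numpy as np

# Hypothetical coefficients for the two Allocation dummies and their
# covariance matrix (NOT the values from the actual analysis)
b = np.array([0.35, 0.11])
cov = np.array([[0.090, 0.012],
                [0.012, 0.110]])

# Joint test of the set: Q = b' V^-1 b, chi-square with df = 2
Q = float(b @ np.linalg.solve(cov, b))
p = math.exp(-Q / 2)   # chi-square upper tail has this closed form when df = 2
print(Q, p)
```

The df for the set equals the number of covariates it contains, which is why the Allocation set in the text is tested with df = 2.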
Figure 144 | Regression | Main results | Assessing the impact of a set
• The statistics for Allocation: Alternate tells us if the use of alternate allocation is related to effect size (when all other covariates are partialled).
• The statistics for Allocation: Systematic tells us if the use of systematic allocation is related to effect size (when all other covariates are partialled).
• The statistics for the set [A] tell us if Allocation as a whole (that is, the use of Random, Systematic, or Alternate allocation) is related to effect size.
When the covariates in the set are the only covariates in the prediction equation (as they are in this example) the statistics for the set will be identical to the statistics for the model. Specifically, the Q-value for the set [A] and the Q-value for the model [B] are both 1.4349. Therefore, in this example (Figure 144) we could have simply employed the test of the model as the test of Allocation.
However, that is not the case when the model includes additional covariates. Consider Figure 145,
where the model includes Latitude and Year as well as the two dummy-variables for Allocation. In this
case, if we want to know the impact of Allocation with Latitude and Year partialled, we need to use the
statistics for the Allocation set, Q = 1.5492, df = 2, p = 0.4609 [A]. We could not use the statistics for the
model [B] to test the impact of allocation, since the test of the model is based on the impact of all four
covariates.
Figure 145 | Main results | Assessing the impact of a set
HOW TO CREATE A SET
When we use a Set to represent a categorical variable, CMA creates the dummy variables automatically
and links them (with the bracket) automatically.
In all other cases we need to link the covariates manually. For this example we’ll work with Latitude-C
and Latitude-C2 (latitude centered, and its square). We want to create a set that incorporates these two
covariates and call the set “Latitude Set”.
In Figure 146
• Move Latitude-C into the model
• Move Latitude-C2 into the model
• Ensure that the covariates intended for the set are sequential in the list
• Highlight these covariates by pressing {SHIFT} and clicking on the covariate names [B]
Figure 146 | Setup | Defining a set of covariates
In Figure 147
• Enter the name Latitude Set and click [Ok] [D]
Figure 147 | Setup | Naming a set of covariates
The program has now created a set called “Latitude-C Set” [E] which includes the two covariates (Figure
148). When you run the regression the program will display statistics for this set (Figure 149).
E
Figure 148 | Setup | Naming a set of covariates
Figure 149 | Regression | Main results | Working with a set of covariates
HOW TO REMOVE A SET
• Highlight the set’s name [F]
• Click “Remove covariates” [G]
Figure 150 | Main results | Removing a set of covariates
PART 14: INTERACTIONS AND CURVILINEAR RELATIONSHIPS
Suppose we run a regression with two covariates, X1 and X2. Consider what happens in two cases —
• When there is not an interaction
• When there is an interaction
There is no interaction if the impact of X1 on the effect size is constant for all values of X2 (and vice versa). There is an interaction between two variables if the impact of one variable depends on the magnitude of the second variable (and vice versa).
When there is not an interaction
When there is no interaction we include two covariates in the model – X1 and X2.
a) B1 gives us the main effect for variable X1, for any value of X2
b) B2 gives us the main effect for variable X2, for any value of X1
When we run a regression with covariates X1 and X2 (but not the interaction), the following is true.
• We assume that the impact of X1 is the same for all values of X2. The p-value for X1 is a test of this constant effect.
• We assume that the impact of X2 is the same for all values of X1. The p-value for X2 is a test of this constant effect.
When there is an interaction
By contrast, consider what happens when there is an interaction of X1 and X2. In this case we create a
new variable X3 (defined as X1 times X2) and enter all three variables into the prediction equation.
a) B1 gives us the first-order effect for variable X1 when X2 is zero
b) B2 gives us the first-order effect for variable X2 when X1 is zero
c) B3 gives us the impact of the interaction (over and above the first-order effects)
When we run a regression with covariates X1 and X2 and the interaction X3, the following is true
• We assume that the impact of X1 depends on the value of X2, and so we assess the impact of X1 at a specific value of X2, zero. The p-value is a test of the relationship between X1 and the effect size at this specific value of X2.
• We assume that the impact of X2 depends on the value of X1, and so we assess the impact of X2 at a specific value of X1, zero. The p-value is a test of the relationship between X2 and the effect size at this specific value of X1.
Centering
To center a variable means to re-scale the variable to have a mean of zero. If the original studies took
place in the years 1930, 1935, 1940, 1945, 1950, we could subtract 1940 from each value to yield scores
of -10, -5, 0, 5, and 10. If the original covariate is Year, the new one could be called Year-C.
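In code terms, centering is nothing more than subtracting the mean, as this small Python sketch of the example above shows:

```python
# Centering: subtract the mean so that zero becomes a meaningful value
years = [1930, 1935, 1940, 1945, 1950]
mean_year = sum(years) / len(years)        # 1940.0
year_c = [y - mean_year for y in years]    # [-10.0, -5.0, 0.0, 5.0, 10.0]
print(year_c)
```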
When we include X1 and X2 (but not the interaction, X3) in the equation, the decision to center (or not)
has no impact on the p-value for the individual covariates.
By contrast, if we include X1, X2, and the interaction X3 in the equation, then the decision to center (or
not) will have a substantial impact on the p-values for the individual covariates. This is because we test
X1 for the case where X2 is zero (and vice-versa).
For example, suppose that X1 and X2 are Latitude and Year.
• Consider the statistics for Year. If we don’t center Latitude, then we test the impact of Year when Latitude is 0. If we do center, then we test the impact of Year when Latitude-C is 0, and (it follows) Latitude is around 33.
• Consider the statistics for Latitude. If we don’t center Year, then we test the impact of Latitude when Year is 0. If we do center, then we test the impact of Latitude when Year-C is 0 and the actual year is 1948.
Centering is also important if we want to assess the impact of curvilinear relationships. For example,
suppose that we want to see if the relationship between Latitude and effect size has a curvilinear
component. For this purpose we need to enter both Latitude and Latitude2 as covariates.
If we don’t center Latitude, then these two covariates will be highly correlated with each other, and it
will be difficult to disentangle the linear from the curvilinear components. By contrast, if we center
latitude and square the centered value, the correlation between the two will be low, and we will be able
to identify the unique impact of each.
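This effect of centering on collinearity is easy to verify. The following sketch uses hypothetical latitudes in the 13-to-55 range (not the actual study values) and compares the correlation between a covariate and its square before and after centering:

```python
import math

def pearson(a, b):
    # Pearson correlation computed from first principles
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

# Hypothetical latitudes in the range discussed (13 to 55)
lat = [13, 18, 27, 33, 42, 44, 55]
mean_lat = sum(lat) / len(lat)
lat_c = [x - mean_lat for x in lat]

r_raw = pearson(lat, [x ** 2 for x in lat])            # near 1
r_centered = pearson(lat_c, [x ** 2 for x in lat_c])   # much smaller
print(r_raw, r_centered)
```

With the raw values the linear and squared terms are almost perfectly correlated; after centering the correlation is small, so the linear and curvilinear components can be separated.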
For these reasons, in the examples that include an interaction (or a curvilinear term) we use variables that have been centered about their mean (for continuous variables) or dummy-coded (as explained below) for categorical variables, so that zero is a meaningful value or category.
The program will not automatically create centered variables, nor variables for the interaction. While it
is possible to enter these variables manually, it’s usually easier to copy the original variables to Excel™,
create the new variables, and then copy these back into CMA.
Important note.
As always, we assume that the reader who plans to work with interactions has a good understanding of
these from primary regression, and focus here on the elements that are specific to meta-analysis and to
the use of this program. Similarly, the very brief overview of centering does not fully address the
implications of scaling or centering, or other issues which may affect the results.
The same rules apply for interactions involving categorical variables, continuous variables, or
combinations of the two types. For clarity, we present an example for each of three cases. These are
• The interaction of two categorical covariates
• The interaction of a categorical covariate with a continuous covariate
• The interaction of two continuous covariates
INTERACTION OF TWO CATEGORICAL COVARIATES
The original data set includes the covariates Latitude and Year, both of which are continuous. For
purposes of this discussion we need two categorical covariates, and we create them by dichotomizing
Latitude and Year (see Appendix 5: Creating variables for interactions).
• Hot is coded 1 if the latitude is 34 or less, and is coded 0 if the latitude exceeds 34.
• Recent is coded 1 if the Year is 1945 or later, and is coded 0 if the Year is earlier than 1945.
• Hot x Recent is created by multiplying Hot by Recent.
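The recoding described above can be sketched as follows, using hypothetical latitude and year values:

```python
# Hypothetical studies; latitude and year are the original covariates
studies = [
    {"latitude": 44, "year": 1948},
    {"latitude": 13, "year": 1973},
    {"latitude": 33, "year": 1935},
]
for s in studies:
    s["hot"] = 1 if s["latitude"] <= 34 else 0       # Hot: latitude 34 or less
    s["recent"] = 1 if s["year"] >= 1945 else 0      # Recent: 1945 or later
    s["hot_x_recent"] = s["hot"] * s["recent"]       # interaction term
print(studies)
```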
Figure 151 shows the model, Figure 152 shows the main results, which are plotted in Figure 153 and
Figure 154. While the program can compute the statistics for interactions it cannot plot these
interactions, and therefore these plots were created in Excel™ (see appendix).
Figure 151 | Setup | Interaction of two categorical covariates
Figure 152 | Main results | Interaction of two categorical covariates
We can display these results as shown in Table 6.
Table 6

                    Time
Climate        Early        Recent
Cold           −1.1154      −1.4416
Hot            −0.2164      −0.3035
Instructions for creating this table are given in the appendix, but the computation is quite intuitive.
Since the covariate values for all cells are a combination of 0s and 1s, the expected effect size in each
cell is given by the sum of the relevant coefficients from Figure 152.
• Upper-left cell is the Intercept (−1.1154) = −1.1154
• Upper-right cell is the Intercept plus Recent (−1.1154 − 0.3261) = −1.4416
• Lower-left cell is the Intercept plus Hot (−1.1154 + 0.8990) = −0.2164
• Lower-right cell is the Intercept plus Recent, Hot, and the Interaction (−1.1154 − 0.3261 + 0.8990 + 0.2391) = −0.3035
Once we have this table, it’s a simple matter to plot the main effects and their interactions as shown in
the following plots.
Hot
Does the vaccine’s effect differ as a function of climate?
Since the equation includes the interaction term, the impact of climate is not a main effect. Rather, it is
a first-order effect that may differ for Early studies vs. Recent studies.
In Figure 153,
The impact of climate for Early studies (Recent = 0) is indicated by the arrow labeled [B].
The impact of climate for Recent studies (Recent = 1) is indicated by the arrow labeled [BB].
The impact of climate is tested for the Early studies [B] since these are the studies coded 0 for Recent.
The p-value for this difference is 0.0184 as shown in Figure 152 [B].
Figure 153 | Plot | Interaction of two categorical covariates
Recent (vs. Early)
Did the impact of the vaccine change from the Early studies to the Recent studies?
Since the equation includes the interaction term, the impact of Recent may differ for Cold studies vs.
Hot studies. In Figure 154,
The impact of Recent for Cold studies (Hot = 0) is indicated by the arrow labeled [B].
The impact of Recent for Hot studies (Hot = 1) is indicated by the arrow labeled [BB].
The impact of Recent is tested for the Cold studies [B] since these are the studies coded 0 for the
covariate (Hot). For these studies the regression line increases slightly as we move from Early to Recent
studies. However, the corresponding p-value for Recent is 0.4311 as shown in Figure 152 [C]. Thus,
there is no evidence that effect size is related to Recent.
Figure 154 | Plot | Interaction of two categorical covariates
Hot x Recent
Does the relationship between Time and effect size vary by Climate? In Figure 155 the regression line
for Time in Hot climates is not strictly parallel to the regression line for Time in Cold climates. However,
the differences in slopes are minor.
Does the relationship between Climate and effect size vary by Time? The impact of Climate in the Recent studies is only slightly larger than the impact of Climate in the Early studies (as indicated by the difference in the height of the two arrows in Figure 153).
These two questions are functionally identical, and the same p-value applies to both. Figure 152 [D] shows the p-value for the interaction is 0.6648.
Figure 155 | Plot | Interaction of two categorical covariates
The full set
Is there a relationship between Time, Climate, and the interaction (as a set) and the effect size? Since
these three are the only covariates in the model, this is addressed by a test of the full model, as shown
in Figure 152 [E].
• The p-value of 0.0016 allows us to reject the null hypothesis that none of the covariates is related to effect size.
• The R2 analog [F] is 0.66, which tells us that 66% of the initial between-study variance in effect sizes can be explained by this combination of covariates.
• Thus, we conclude that the full model (Time, Climate, and the interaction between them) is able to explain at least some of the variance in effect size.
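The R2 analog is conventionally computed as the proportional reduction in the between-study variance (T2) when the covariates are added to the model. A sketch with hypothetical variance estimates, chosen so that the result reproduces the 0.66 reported here:

```python
# R-squared analog: proportional reduction in between-study variance.
# The tau-squared values below are hypothetical, chosen to reproduce 0.66.
tau2_total = 0.3100   # T2 with no covariates in the model
tau2_resid = 0.1054   # T2 with all covariates in the model

r2_analog = 1 - tau2_resid / tau2_total
print(round(r2_analog, 2))
```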
INTERACTION OF A CATEGORICAL COVARIATE WITH A CONTINUOUS
COVARIATE
The original data set includes the covariates Latitude and Year, both of which are continuous. For the
purpose of this example we created the following variables based on Latitude and Year (see Appendix 5:
Creating variables for interactions).
• Year-C is the study year, centered
• Hot is coded 0 for studies in Cold climates, and 1 for studies in Hot climates
• Year-C x Hot is the interaction
Figure 156 shows the model, Figure 157 shows the main results. While the program can compute the statistics for interactions it cannot plot these interactions, and therefore the plots (Figure 158, Figure 159, and Figure 160) were created in Excel™ (see appendix).
Figure 156 | Setup | Interaction of categorical and continuous covariates
Figure 157 | Main results | Interaction of categorical and continuous covariates
In two-way analysis of variance, we sometimes display the results as a 2x2 table, and we can do the
same here, as shown in Table 7.
Table 7

                    Year-C
Climate        −15          20
Cold           −1.0184      −1.9286
Hot            −0.0276      −0.3929
Instructions for creating this table are given in the appendix.
Once we have this table, it’s a simple matter to plot the main effects and their interactions as shown in
the following plots.
Hot
Does the vaccine’s effect differ as a function of climate? Since the equation includes the interaction
term, the impact of climate is not a main effect, but rather varies depending on the time frame. Here, it
will be evaluated for studies where Year-C = 0 (and the actual year is 1948) as indicated by the arrow in
Figure 158 [C].
When Year-C is zero (and the actual year is 1948) the effect size for the Cold studies (at the bottom of the arrow) is substantially larger than the effect size for the Hot studies (at the top of the arrow). The p-value for this difference is 0.0007 as shown in Figure 157 [C].
Figure 158 | Plot | Interaction of categorical and continuous covariates
Year-C
Did the impact of the vaccine change over time? Since the equation includes the interaction term, the
impact of Year is not a main effect. Rather, it is a first-order effect that may vary with climate.
In Figure 159,
The impact of Year for Cold studies (Hot=0) is indicated by the arrow labeled [B].
The impact of Year for Hot studies (Hot=1) is indicated by the arrow labeled [BB].
The impact of Year is tested for the Cold studies [B] since these are the studies coded 0 for Hot.
The effect size increases (moves away from zero) as we move from 1930 to 1970, but the corresponding
p-value for Year-C is 0.3430 as shown in [Figure 157 B]. Thus, there is no evidence that effect size is
related to year.
Figure 159 | Plot | Interaction of categorical and continuous covariates
Hot x Year-C
Does the relationship between Year and effect size vary by Climate? In Figure 158 the regression line for Year in Hot climates is not strictly parallel to the regression line for Year in Cold climates. However, the differences in slopes are minor.
Does the relationship between Climate and effect size vary by Year? The impact of Climate in 1968 is
only slightly larger than the impact of Climate in 1933 (as indicated by the difference in the height of the
two arrows in Figure 158).
These two questions are functionally identical, and the same p-value applies to both. Figure 157 [D]
shows the p-value for the interaction is 0.6343.
Figure 160 | Plot | Interaction of categorical and continuous covariates
The full set
Is there a relationship between Year-C, Hot, and the interaction (as a set) and the effect size? Since
these are the only covariates in the model, this is addressed by a test of the full model [Figure 157 E].
• The p-value of 0.0008 allows us to reject the null hypothesis that none of the covariates is related to effect size.
• The R2 analog [Figure 157 F] is 0.68, which tells us that 68% of the initial between-study variance in effect sizes can be explained by this combination of covariates.
• Thus, we conclude that the full model (Time, Climate, and the interaction between them) is able to explain at least some of the variance in effect size.
INTERACTION OF TWO CONTINUOUS COVARIATES
In this example we assess the impact of Year-C, Latitude-C, and Year-C x Latitude-C. For instructions on
creating the data set used here, see Appendix 5: Creating variables for interactions
• Latitude-C is the latitude, centered
• Year-C is the study year, centered
• Latitude-C x Year-C is the interaction
Figure 161 shows the model, Figure 162 shows the main results, and Figure 163 through Figure 165 show plots of these results. While the program can compute the statistics for interactions it cannot plot these interactions, and therefore these plots were created in Excel™ (see Plotting the interaction of two continuous covariates).
Figure 161 | Setup | Interaction of two continuous covariates
Figure 162 | Main results | Interaction of two continuous covariates
Working with the screen shown in Figure 162 we can create Table 8. Then we use the numbers in this
table to create the subsequent plots in Excel™. Details are provided in the appendix.
Table 8

                    Latitude-C
Year-C         −20          0            +21
−15            0.1319       −0.6171      −1.4035
20             −0.2796      −0.7835      −1.313
Latitude-C
Does the vaccine’s effect differ as a function of latitude? Since the equation includes the interaction of
Latitude and Year, the impact of latitude is not a main effect, but rather varies depending on the Year.
The impact of Latitude on the effect size will be evaluated for studies where Year-C = 0 (and the actual
year is 1948) as indicated by the arrow [B] in Figure 163.
When Year-C is zero (and the actual year is 1948) the effect size for the studies where Latitude is 55 (at
the bottom of the arrow) is substantially larger than the effect size for the studies where Latitude=13 (at
the top of the arrow).
At this Year there is a statistically significant relationship between latitude and effect size, with p =
0.0066 as shown in Figure 162 [B].
Figure 163 | Plot | Interaction of two continuous covariates
Year-C
Did the impact of the vaccine change over time? Since the equation includes the interaction of Year by
Latitude, the impact of Year is not a main effect. Rather, it is a first-order effect that may vary with
Latitude.
In Figure 164,
Since the equation includes the interaction term, the impact of Year-C will be evaluated at the point
where Latitude-C = 0 (and the actual latitude is 33). This is represented by Line [C] in Figure 164.
At this latitude the regression line seems to be relatively horizontal. The corresponding p-value for Year-C is 0.7594 as shown in Figure 162 [C]. Thus, there is no evidence that effect size is related to year
at this latitude.
Figure 164 | Plot | Interaction of two continuous covariates
Latitude-C x Year-C
Does the relationship between Year and effect size vary by latitude? While the lines for the different
latitudes in Figure 165 are not strictly parallel, the differences in slopes are minor and not statistically
significant.
Does the relationship between latitude and effect size vary by Year? The impact of latitude in 1968 is
only slightly larger than the impact of latitude in 1933 (as indicated by the difference in the height of the
two arrows in Figure 165).
These two questions are functionally identical, and the same p-value applies to both. Figure 162 [D]
shows the p-value for the interaction is 0.7181.
Figure 165 | Plot | Interaction of two continuous covariates
The full set
Is there a relationship between Year-C, Latitude-C, and the interaction (as a set) and the effect size?
Since these are the only covariates in the model, this is addressed by a test of the full model, as shown
in Figure 162 [E].
• The p-value of 0.0097 allows us to reject the null hypothesis that none of the covariates is related to effect size.
• The R2 analog [F] is 0.60, which tells us that 60% of the initial between-study variance in effect sizes can be explained by this combination of covariates.
• Thus, we conclude that the full model (Year, Latitude, and the interaction between them) is able to explain at least some of the variance in effect size.
CURVILINEAR RELATIONSHIPS
Earlier, we established that there is a linear relationship between latitude and effect size. Suppose we
want to test the hypothesis that the relationship between latitude and effect size is actually curvilinear
– for example, that the vaccine’s impact is relatively constant as we move from a latitude of 13 to 30,
but then increases as we move from 30 to 55. Or, that the vaccine’s impact increases sharply as we
move from a latitude of 13 to 30, but is relatively unchanged beyond that point.
A curvilinear relationship can be seen as a kind of interaction, and that is the approach we take here. In any interaction we ask if the impact of one covariate depends on the level of another covariate. Typically, the two covariates are distinct (A and B). Here, they are the same (A and A) but the idea is the same. We are asking if the impact of A depends on the level of A.
When working with curvilinear (or higher-order) relationships it’s generally a good idea to center
variables, and that’s the practice we follow here. To assess the hypothesis that there is a curvilinear
relationship between latitude and effect size we’ll need two covariates
• Latitude-C is simply Latitude centered, to have a mean of zero.
• Latitude-C2 is the square of Latitude-C.
For information on how to create these variables see Appendix 5: Creating variables for interactions.
Figure 166 shows the model, Figure 167 shows the main results, and Figure 168 shows a plot of these
results. While the program can compute the statistics for curvilinear relationships it cannot plot these
relationships, and therefore Figure 168 was created in Excel™ (see appendix).
Figure 166 | Setup | Curvilinear relationship
Figure 167 | Main results | Curvilinear relationship
Figure 168 | Plot | Curvilinear relationship
In this example, where the effects are in log units and range downward from zero, an effect size of zero
reflects no effect while an effect size of −1.5 reflects a substantial effect. In this example, as the latitude
increases the effect size increases (moves away from zero).
We want to evaluate the linear component of this relationship, the curvilinear component, and then the
two (as a set).
Latitude-C
The line for Latitude-C in Figure 167 [A] addresses the linear relationship between latitude and effect
size. The p-value is < 0.0001, which tells us that the linear relationship is statistically significant.
Note. Since we have included an interaction, the linear component varies as a function of latitude.
Therefore, the coefficient for Latitude-C is not a slope but rather the tangent to the curved line where
Latitude-C is zero.
Latitude-C2
The line for Latitude-C2 in Figure 167 [B] addresses the curvilinear component of the relationship between latitude and effect size. The line in the plot is curvilinear, with a slope that is initially shallow but increases as the latitude increases. Is this line a better fit for the data than a straight line would be?
This is addressed by the p-value for this covariate, which is 0.2600. There is no evidence that the
relationship is curvilinear.
Test of the model
Since the only covariates in the model are Latitude-C and Latitude-C2, the model [C] tests the null
hypothesis that both coefficients are zero. In Figure 167 [C] the Q-value is 17.90, and with 2 degrees of
freedom the p-value is 0.0001. We can conclude that there is a relationship between these covariates
(as a set) and the effect size.
PART 15: MISSING DATA
In regression for meta-analysis, as in regression for primary studies, there are many options for dealing with missing data. The program takes a very simple approach, as follows: if a study is missing data for the outcome or for any of the covariates in the covariate list, that study is excluded from the analysis.
Note that this exclusion is based on all covariates listed on the main screen, and not only on the
covariates that are checked.
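The exclusion rule can be sketched as simple listwise deletion, with missing values represented as None (the studies and values below are hypothetical):

```python
# A study is dropped if its effect size or ANY listed covariate is missing
studies = [
    {"name": "Study 1", "effect": -1.0, "latitude": 44, "year": 1948},
    {"name": "Study 2", "effect": -0.5, "latitude": None, "year": 1960},
    {"name": "Study 3", "effect": None, "latitude": 33, "year": 1935},
]
covariate_list = ["latitude", "year"]   # all covariates in the list, checked or not

included = [
    s for s in studies
    if s["effect"] is not None
    and all(s[c] is not None for c in covariate_list)
]
print([s["name"] for s in included])
```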
Figure 169 | Setup
To show how the program handles missing data we need to create some missing data. For this purpose,
• Highlight the three cells in the latitude column for studies 9, 10, 11 [A]
• Press the Delete key
Figure 170 | Data-entry | Missing data for latitude
Run the analysis.
The main analysis screen (Figure 171) shows that the three studies are missing latitude [B]. These
studies are still included in this analysis of the mean effect.
Figure 171 | Basic analysis | Missing data for latitude
• Proceed to the regression module
• Create a model that includes latitude, and tick latitude, as in Figure 172 [C]
Figure 172 | Regression | Setup | Latitude in list and checked
The results are shown in Figure 173. Note that the analysis is based on 10 studies [D] rather than 13,
since three studies have been excluded.
Figure 173 | Regression | Main results | Missing data
• To see which studies have been excluded click More results > All data
• The program displays a line for every study in the database, and missing data points are highlighted in red (Figure 174).
Figure 174 | Table of missing data
This is a good way to identify the missing data and also to identify patterns of missing data.
• If data is missing primarily for one covariate across a lot of studies you may decide to remove that covariate from the analysis.
• If data is missing primarily for a few studies across many covariates you may decide to remove those studies and keep the covariates.
Of course, the decision to adopt one of these approaches or some other will depend on a host of
factors, with attention paid to avoiding bias. However, the ability to identify the patterns of missing
data is a crucial first step in this process.
In some cases, you may want to use another approach for missing data. For example, you may want to
replace missing data with the mean, or a value imputed in some other way. You can do this by returning
to the main data-entry screen and simply entering the desired value in place of the missing value.
In a more sophisticated version of this scheme you can create several variables based on the same variable, but using different approaches to missing data. For example, suppose the initial variable is
Dose. You can create one variable called DoseA that replaces missing data with the mean, and another
variable called DoseB that replaces missing data with another imputed score. Then, in any given model
you would use one or the other, but not both.
Important
Missing data is based on all covariates in the list, and not only those that are checked. The program
works this way to ensure that if you define two or more prediction models, the same studies will be
used in all the models.
In Figure 175 we have un-checked Latitude [F], but the three studies are still excluded from the analysis
as we can see in Figure 176 [G].
Figure 175 | Setup | Latitude in list, unchecked
Figure 176 | Main results | Latitude in list, unchecked
In this example the studies are being excluded because they are missing a value for latitude. How can
we include these studies in the analysis?
• If we want to use latitude as a predictor, the only option is to return to the data-entry screen and enter a value for latitude for each of these studies.
• If we are willing to run the regression without latitude, then we need to remove latitude from the list of covariates on the main screen.
It is not sufficient to simply un-tick latitude. Rather, we must remove it as shown in Figure 177.
• Highlight “Latitude” [G]
• Click “Remove covariates” [H]
Figure 177 | Setup | Latitude must be removed from list
Figure 178 | Setup | Latitude removed from list
We now run the analysis again (Figure 179) and see that the number of studies [I] has returned to 13.
Figure 179 | Main results | Latitude removed from list
Note.
There are more sophisticated methods for handling missing data, such as multiple imputation and
selection models for non-ignorable missing data. While these are beyond the scope of this manual,
these and other methods can be implemented using CMA. You would use an external program to
determine the data value for each study, and then input this value via the data-entry screen.
In this chapter we assumed that each study has been entered into the database as one row, and the data for that row is either present or absent.
In Part 18: Complex data structures we show how to create data by combining data across subgroups,
outcomes, or time-points, and how missing data is handled in that case.
PART 16: FILTER STUDIES
In some cases you may want to run a regression using a subset of the data. For example, you may want to limit the analysis to studies that employed acceptable methods for randomization and double-blinding. Or, you may want to limit the analysis to studies that were performed within the past ten years, or to studies that employed specific variants of the intervention or that enrolled persons from specific populations.
This procedure is called “Filtering”, in that we create a filter, and only studies that pass through the filter
are submitted for the regression. The process is actually very simple, but requires that you understand
the relationship among three distinct modules in the program. This is shown schematically in the
following three figures.
1. In the data-entry module we enter data for all studies (Figure 180)
2. In the main analysis module we can create filters (Figure 181)
3. Studies that pass through the filters are submitted to the regression module (Figure 182)
Figure 180 | Data entry
Figure 181 | Basic analysis
Figure 182 | Meta-regression
We provide a few examples of filtering
Example 1
Suppose you want to exclude two specific studies (Aaronson, Stein & Aaronson) by name.
On the main analysis screen (Figure 183), right-click on the names and select “Select by Study name”.
Figure 183 | Select by study name
In Figure 184, un-tick the two studies and click [Ok]
Figure 184 | Select by study name
The main analysis is now based on the remaining 11 studies. When you proceed to meta-regression, only these studies will be transmitted.
Example 2
The process of excluding studies by name works well if you only need to exclude a few studies, but
becomes tedious and error-prone if you have a large database and need to exclude many studies. In this
case, there are better options.
Suppose that you want to run a series of analyses using a specific subset of the studies. Create a
categorical moderator (let’s call this Set-A) and code each study as belonging to this set (or not) as in
Figure 185 [A].
A
Figure 185 | Create a moderator for filtering
On the main analysis screen
• Click Computation options > Select by
• Click the tab for Moderator
• Select Set-A
• De-select “No”
The analysis is now based on the eight studies that had been coded “Yes”. The mean effect size is
−0.5894 [B].
Figure 186 | Filter by moderator
Click Analysis > Meta-regression 2
Run the regression
Figure 187 | Regression using a filter
The regression is based on these eight studies [C]. The mean effect size (the intercept) is −0.5894 [D].
You may create a series of these “Sets”, and easily switch between them. Run the regression for Set-A,
and then for Set-B.
Example 3
You can filter studies based on existing moderators. For example, suppose you wanted to run an
analysis using studies in a Hot climate that employed either Systematic or Random allocation.
• Click Computational options > Select by
• Select Climate
• Tick [Hot]
Figure 188 | Select by moderator
• Select Allocation
• Tick “Random” and “Systematic”
Figure 189 | Select by moderator
The analysis is now based on studies that meet both criteria, as shown in Figure 189 [E].
To ensure that things are working as intended, after running the regression click on All studies to see
which have been used in the regression (Figure 190).
Figure 190 | Filter by moderator
The studies in the regression are the same ones that were included in the main analysis when we
applied these filters (Figure 189).
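Conceptually, a filter is just a condition that each study must satisfy before its row is passed to the regression module. The sketch below illustrates that logic with hypothetical study records and field names (this is not CMA's internal code):

```python
# Hypothetical study records; CMA applies equivalent logic internally
# when "Select by" filters are set on the main analysis screen.
studies = [
    {"name": "Study 1", "climate": "Hot",  "allocation": "Random"},
    {"name": "Study 2", "climate": "Cold", "allocation": "Random"},
    {"name": "Study 3", "climate": "Hot",  "allocation": "Systematic"},
    {"name": "Study 4", "climate": "Hot",  "allocation": "Alternate"},
]

def passes_filter(study):
    # Keep only Hot-climate studies with Random or Systematic allocation,
    # mirroring Example 3 above.
    return (study["climate"] == "Hot"
            and study["allocation"] in {"Random", "Systematic"})

selected = [s["name"] for s in studies if passes_filter(s)]
print(selected)  # ['Study 1', 'Study 3']
```

Only the rows that pass the condition are handed to the regression; everything else on the data-entry screen is untouched.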
PART 17: DEFINING SEVERAL MODELS
Typically you will create one prediction model, which is the list of covariates to be included in the
analysis. Then, you might try another model, and another, working one model at a time.
Figure 191 | Defining several models | Setup
The program offers another option – to define a number of prediction models at once, and then run them all simultaneously. Here, for example, the user has defined one model that includes only the intercept [A], a second that adds year [B], and a third that adds latitude [C].
Figure 192 | Defining several models | Setup
Then, when she runs the analysis the program runs all of the models as shown in Figure 193. The user
can switch among them by using tabs at the bottom of the screen [A].
Figure 193 | Defining several models | Main-analysis | Intercept
In Figure 193 the user has clicked on the tab labeled “Intercept” [A]. The screen displays the statistics
for the analysis based on intercept alone.
Figure 194 | Defining several models | Main-analysis | Intercept + year
In Figure 194 the user has clicked on the tab labeled “+Year” [B]. The screen displays the statistics for
the analysis based on intercept and year.
Figure 195 | Defining several models | Main-analysis | Intercept + year + latitude
In Figure 195 the user has clicked on the tab labeled “+Latitude” [C]. The screen displays the statistics
for the analysis based on intercept, year and latitude.
Why would we want to define more than one model?
In the running example we defined three models on the main screen. Why take this approach, rather
than simply working with one prediction model at a time? There are two reasons why this option may
be useful.
First, the program summarizes the results of all models on one screen (Figure 196). To navigate to this
screen click More results > [Compare models Detailed]
Figure 196 | Defining several models | Main-analysis | Intercept + year + latitude
Second, the program displays a test for the difference in the explanatory power of the models (Figure
197). For example, the cell indicated by [A] compares the model that includes intercept and latitude
with the one that adds year as well. This option is only available when one model is a subset of the
other. When this is not the case, the corresponding cell will be left empty.
Figure 197 | Defining several models | Main-analysis | Intercept + year + latitude
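For a fixed-effect analysis, the comparison of two nested models amounts to differencing their model Q statistics; the difference follows a chi-square distribution with degrees of freedom equal to the difference in the number of covariates. The Q values below are hypothetical (not taken from the BCG data):

```python
import math

# Hypothetical model-Q values for two nested fixed-effect models.
q_small, k_small = 121.50, 1   # intercept + latitude (1 covariate)
q_large, k_large = 124.30, 2   # intercept + latitude + year (2 covariates)

q_diff = q_large - q_small     # chi-square with (k_large - k_small) df
df_diff = k_large - k_small

# For 1 df the chi-square survival function is erfc(sqrt(x/2));
# for more df you would use a chi-square routine instead.
p = math.erfc(math.sqrt(q_diff / 2.0))
print(round(p, 4))
```

A small p-value here would indicate that adding the extra covariate significantly improves the model's explanatory power.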
How do we choose what covariates to include in each model? This depends on the questions we want to address. For example, suppose the primary goal of the analysis is to assess the impact of treatment.
• One series of covariates such as mean age and location is seen as noise
• One series of covariates represents treatment condition
• One series (such as dose by treatment) represents potential interactions
We might define one model as “Nuisance”, a second as “Plus Treatment”, and a third as “Plus
interactions”. Then, the summary screen provides a quick look at the three models while the
comparison screen shows the statistical tests of the differences among them.
Working with multiple predictive models
As shown in Figure 198, to create a series of prediction models, use the toolbar [A] to
• Insert new models
• Delete models
• Rename models
• Move models left or right
Figure 198 | Defining several models | Setup
Two common scenarios for multiple models are a diagonal sequence and an incremental sequence. The
program can generate these kinds of series automatically.
The incremental sequence is shown in Figure 199. The first model includes the intercept only, and then one covariate is added at each step. The name for each model is “plus” that variable, since the variable has been added to the covariates.
To create this sequence click Generate sequence > Incremental sequence [A]
Figure 199 | Defining several models | Main-analysis | Intercept + year + latitude
The diagonal sequence is shown in Figure 200. Each model includes the intercept plus one covariate. The name for each model is that variable alone.
To create this sequence click Generate sequence > Diagonal sequence [B]
Figure 200 | Defining several models | Setup | Year or Latitude
Note.
When you run multiple models, always be sure to select the desired tab at the bottom when studying
the results. These tabs control all tables that are model-specific.
In the earlier example we created models called “Intercept”, “+Year”, and “+Latitude”. Suppose we then click on “Scatterplot”. We need to select a tab at the bottom of the screen to choose the model for the scatterplot.
In Figure 201 we’ve clicked “+Year” [B] and the scatterplot is based on the intercept and year (as we can see from the prediction equation [C]).
Figure 201 | Multiple predictive models | Plot based on Year
By contrast, in Figure 202 we’ve clicked “+Latitude” [D] and the scatterplot is based on the intercept,
latitude, and year, as we can see from the prediction equation [E].
Figure 202 | Multiple predictive models | Plot based on Year + Latitude
The same holds true for most screens. Specifically, as shown in Table 9:
• Screens that present results for one predictive model will change as the user selects one or another model using the tabs at bottom.
• Screens that collate results for all predictive models do not change as the user selects one or another model using the tabs at bottom.
Table 9

Screen                      | Varies by model | Identical for all models
Main results                |        X        |
Scatterplot                 |        X        |
R2 graphic                  |        X        |
Covariance                  |        X        |
Correlation                 |        X        |
Diagnostics                 |        X        |
All studies                 |                 |            X
Valid studies               |                 |            X
Increments                  |                 |            X
Models summary              |                 |            X
Compare models (detailed)   |                 |            X
Compare models (p-values)   |                 |            X
PART 18: COMPLEX DATA STRUCTURES
On the main data-entry screen every row usually represents a single study. If there are 20 studies on
the data-entry screen there will be 20 studies on the main analysis screen, and 20 studies transmitted to
the meta-regression module.
However, in the case of complex data sets the situation is a bit more complicated. There are two kinds
of complex data that we need to address.
• One is the case where we include two or more independent subgroups for some (or all) studies.
• The other is the case where we include two or more non-independent outcomes, time-points, or comparisons for some (or all) studies.
The program has a mechanism in place for dealing with these studies. The key to this mechanism is that
all the data filtering and merging is performed on the main data-analysis screen. The rows displayed on
this screen are the rows that will be transmitted to the regression module. In the regression module
these rows will be treated as independent of each other.
INDEPENDENT SUBGROUPS WITHIN STUDIES
Consider the case where studies assess the impact of a drug, and report the results separately for males
and for females. To record data for Independent subgroups (where each subject appears in one
subgroup or the other, but not both) we use Insert > Column > for subgroups
Figure 203 | Data-entry | Complex data-structures
In this example we have five studies, and each reports the effect separately for males and females. For
illustrative purposes we’ve set the effect size 0.20 points higher for males vs. females, and we’ve set the
variance the same (0.10) for all subgroups for all studies.
Figure 204 | Data-entry | Complex data-structures
We have two options.
• We can use the subgroup as the unit of analysis
• We can combine subgroups for each study, and use study as the unit of analysis
Using subgroup as the unit of analysis
We can use subgroup as the unit of analysis. In this case the variance of the summary effects should be
0.10/10, or 0.01, and the standard error should be 0.10.
On the data-analysis screen we right-click on the column labeled “Subgroup within study” and select
“Use subgroup as the unit of analysis”
Figure 205 | Basic analysis | Subgroup within-study as unit of analysis
Figure 206 | Basic analysis | Subgroup within-study as unit of analysis
As expected, in Figure 206 the variance [A] is 0.01 and the standard error [B] is 0.10.
Now, we proceed to the meta-regression module.
For illustrative purposes we’ll run a regression with only the intercept
Figure 207 | Regression | Subgroup within-study as unit of analysis
• In Figure 207 the number of studies in the analysis [C] is 10.
• The standard error [D] is 0.10 (which implies a variance of 0.01).
These are the same numbers we saw in Figure 206.
Finally, we can click on More Results > All studies to display the data being used in the meta-regression
(Figure 208). We see that the program is, in fact, working with the same 10 units that we had seen on
the main analysis screen (Figure 206).
Figure 208 | Regression | Subgroup within-study as unit of analysis
Using study as the unit of analysis
Immediately above we used subgroup as the unit of analysis. We also have the option of using study as
the unit of analysis. In this case the program will merge the data for the subgroups within each study to
yield study-level data. This data will then be used in the main analysis, and also in the meta-regression.
Since we are working with independent subgroups within each study, the variance of the estimate
should be approximately the same regardless of whether we choose to use (a) subgroup or (b) study as
the unit of analysis.
Above, we saw that when we used subgroups as the unit of analysis the variance of each subgroup was
0.10, and so the variance of the combined effect was 0.10/10=0.01.
When we use study as the unit of analysis the variance within each study is 0.10/2 (for two subgroups),
or 0.05. Then, the variance of the combined effect is 0.05/5, which (as before) is 0.01.
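The arithmetic above can be sketched directly; the numbers mirror the worked example (variance 0.10 per subgroup, five studies, two subgroups each):

```python
# Sketch of the variance arithmetic: five studies, two independent
# subgroups each, every subgroup with variance 0.10.
sub_var = 0.10
n_studies, n_subgroups = 5, 2

# (a) Subgroup as the unit of analysis: 10 independent units.
v_combined_subgroups = sub_var / (n_studies * n_subgroups)

# (b) Study as the unit of analysis: first merge the two independent
# subgroups within each study (0.10 / 2 = 0.05), then combine the five
# study-level estimates.
study_var = sub_var / n_subgroups
v_combined_studies = study_var / n_studies

# Both routes give a combined-effect variance of 0.01 (standard error 0.10).
print(round(v_combined_subgroups, 4), round(v_combined_studies, 4))
```

Because the subgroups are independent, the two routes partition the same information and arrive at the same combined-effect variance.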
To see this, let’s return to the main analysis screen and select Use Study as the unit of analysis
Figure 209 | Basic analysis | Study as unit of analysis
Figure 210 | Basic analysis | Study as unit of analysis
In Figure 210 the variance for each study is 0.05, the variance for the combined effect [A] is 0.01, and the standard error [B] is 0.10. These are the same numbers we saw in Figure 206.
Again, we can proceed to meta-regression and run the analysis with only the intercept.
Figure 211 | Regression | Study as unit of analysis
• In Figure 211 the number of studies in the analysis [C] is 5.
• The standard error [D] is 0.10 (which implies a variance of 0.01).
These are the same numbers we saw in Figure 210.
Finally, we can navigate to More Results > All Data. Figure 212 shows that we now have five studies
rather than 10 subgroups, and the variance for each study is 0.05.
Figure 212 | Regression | Study as unit of analysis
The point is, when we have independent subgroups within studies, every subgroup yields independent
information, and must be treated as such in the analysis. We may elect to use subgroup as the unit of
analysis or we may elect to use study as the unit of analysis, but in either case the within-study variance
for the combined effect should be approximately the same. It is in the main analysis, and it is in the
regression.
If the two options yield the same result here, does it matter which one we use? Yes, it does.
In this example we created a homogeneous set of studies for illustrative purposes, and tau-squared was
zero. When tau-squared is zero, the two approaches yield very similar (if not identical) results.
By contrast, in a real analysis the estimate of tau-squared will often be different if based on 10
subgroups as compared with 5 studies. Concretely, if the effect sizes tend to vary a lot from one study
to the next, but to be relatively similar for the subgroups within a study, it follows that tau-squared
based on studies will tend to be larger while tau-squared based on subgroups will tend to be smaller.
The decision to use one or the other depends on how you see the sampling frame, and the population to which you want to generalize. If you want to get a sense of how the effects are distributed across studies, then it makes sense to use study as the unit of analysis. If you want to get a sense of how the effects are distributed across subgroups, then it makes sense to use subgroups as the unit of analysis.
As noted above, the precision for estimating the mean effect will likely be similar in the two cases.
However, the estimate of the variance itself (and statistics that depend on this estimate) will differ.
This applies to both the main analysis and also to the regression.
For regression, there is an additional consideration, as follows.
Suppose that the subgroups are Male and Female. If you intend to use gender as a covariate, then the
only option would be to use subgroup as the unit of analysis.
Even if you don’t intend to use gender as a covariate, using subgroup as the unit of analysis may allow
you to work with a finer level of data. For example, suppose you have the mean age for each subgroup,
and plan to use mean age as a covariate. If you use subgroup as the unit of analysis you can use the
mean age for each subgroup. By contrast, if you use study as the unit of analysis you’ll need to use the
mean age for the study, and any difference in age between subgroups will be lost.
MULTIPLE OUTCOMES OR TIME-POINTS
Above, we addressed the case where we have independent subgroups within studies. The key was that
each subgroup represented independent information (a person was included in one subgroup or
another, but not both) and was treated as such in the analysis.
Now, we turn to the case where we have multiple outcomes or time-points within studies. The key here
is that the rows for each study are based (at least partly) on the same persons, and do not provide
independent information. We’ll use the example of multiple outcomes and then comment on the issue
of multiple time-points below.
To highlight the difference between independent subgroups on the one hand, and multiple outcomes on
the other, we’ll use the same data as before. However, this time we’ll identify the rows within each
study as being for two outcomes (reading and math) rather than for two subgroups (male and female).
When creating the data file we use Insert > Column for > Outcome names (Figure 213).
Figure 213 | Data-entry | Multiple outcomes
The data are shown in Figure 214.
Figure 214 | Data-entry | Multiple outcomes
Proceed to the main analysis screen
The program initially shows an analysis for Math only (Figure 215).
Figure 215 | Basic analysis | Multiple outcomes | Select one outcome
Right-click on Outcome to see the following options (Figure 216)
• Use the mean of the selected outcomes
• Use all of the selected outcomes, assuming independence
• Use the first outcome, based on this sequence
Figure 216 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence
Here, we select “Use all of the selected outcomes, assuming independence”.
We present this option here to illustrate its impact, and not to suggest that this is generally a valid option. In fact, we would consider using this option only when (a) there is only minor overlap in the samples and/or (b) the correlation between outcomes is small. Otherwise, this option will underestimate the variance (overestimate the precision) of the summary effect size.
By selecting this option we are treating the correlation between outcomes as zero, which is (in effect)
what we did with two independent subgroups. It follows that the results should be the same as they
had been before. In fact, in Figure 217 the combined effect has a variance [A] of .01 and a standard
error [B] of .10, which are the same numbers we saw in Figure 206.
Figure 217 | Basic analysis | Multiple outcomes | Use all outcomes, assuming independence
As before, we’ll run a meta-regression with only the intercept (Figure 218), to show that the regression
will yield the same results as the traditional analysis.
Figure 218 | Multiple outcomes | Setup
The results (Figure 219) show a standard error [C] of 0.1000, with an implied variance of 0.0100, the same numbers as we saw in Figure 217.
Figure 219 | Multiple outcomes | Use all outcomes, assuming independence
Finally, More results > All data displays the actual rows of data in the analysis (Figure 220), which are
identical to those in Figure 217.
Figure 220 | Multiple outcomes | Use all outcomes, assuming independence
The point of this exercise was to show that when we treat the outcomes as independent of each other,
the impact is the same as when we are working with independent subgroups. The program treats every
line of data as though it is providing new (unique) information. While this is usually appropriate for
independent subgroups, it is rarely appropriate for multiple outcomes. We will usually want to treat the
outcomes as dependent, and compute the variance accordingly, as shown here.
Back on the main analysis screen (Figure 221) select “Use the mean of the selected outcomes” [A]
Figure 221 | Basic analysis | Multiple outcomes | Use mean of outcomes
In Figure 222 the program combines the data for math and reading to yield a “Combined” score for each
study [B].
Figure 222 | Basic analysis | Multiple outcomes | Use mean of outcomes
In computing the variance for each study the program assumes a correlation of 1.0 between reading and
math. The variance for reading is 0.10, the variance for math is .10, and the variance for the study-level
composite is also 0.10. This follows from our decision to treat the correlation between reading and
math as 1.0. This means that the second outcome is providing no new information (and so has no
impact on the variance). This approach is conservative, in the sense that the true correlation is usually
less than 1; the second outcome probably provides some new information; and the true variance is
probably lower than the value we are using.
As always, once we have a variance for each study we can compute the variance for the combined effect size. In Figure 222 the variance of the combined effect [C] is 0.10/5, or 0.02. This is twice as large as the value in Figure 217, which treated the outcomes as independent of each other. Similarly, the standard error of the combined effect [D], computed as the square root of the variance, is now 0.1414, as compared with 0.1000 in Figure 217.
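The rule the program is applying is the standard formula for the variance of the mean of two correlated estimates; the helper function below is ours, with the numbers from the running example:

```python
import math

def composite_variance(v1, v2, r):
    # Variance of the mean of two outcomes with variances v1, v2 and
    # correlation r: (v1 + v2 + 2*r*sqrt(v1*v2)) / 4.
    return (v1 + v2 + 2.0 * r * math.sqrt(v1 * v2)) / 4.0

v = 0.10
n_studies = 5

# Correlation 1.0 (the program's assumption): the composite variance
# stays 0.10, the combined-effect variance is 0.10/5 = 0.02, SE ~ 0.1414.
v_study = composite_variance(v, v, 1.0)
v_combined = v_study / n_studies
print(round(v_study, 4), round(v_combined, 4), round(math.sqrt(v_combined), 4))

# Correlation 0.0 (independence): composite variance 0.05, combined 0.01,
# matching the earlier analysis that used all 10 rows separately.
v_combined_indep = composite_variance(v, v, 0.0) / n_studies
print(round(v_combined_indep, 4))
```

Any intermediate correlation gives a composite variance between these two extremes, which is why assuming a correlation of 1 is the conservative choice.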
The results for the regression (Figure 223) are identical to those for the traditional analysis (Figure 222).
In particular, the standard error [E] is 0.1414, which implies a variance of 0.0200.
Figure 223 | Multiple outcomes | Use mean of outcomes
Figure 224 | Multiple outcomes | Use mean of outcomes
Finally, we can click More results > All studies to see the data rows that are being transmitted to the regression (Figure 224). There are five rows, and the variance for each is 0.10, precisely as we had seen in Figure 222.
Other options
Our goal here was to show how the transformations and filtering specified in the main analysis are
carried over to the regression. For this purpose we outlined two options, to treat the outcomes as
independent or to combine them into a composite score assuming a correlation of 1.0.
The program includes a number of other options for working with multiple outcomes. These are not
discussed here, but could be used in the main analysis screen and would carry over to the regression.
These include selecting one outcome in preference to the others, and taking the mean of some
outcomes while excluding others. While the program always assumes a correlation of 1 when it
automatically generates composite scores, you can create composite scores manually using any
correlation.
How covariates are merged
To this point we’ve shown what happens with the effect size and variance when we are working with
multiple subgroups or outcomes within a study. Now, we need to consider what happens with the
covariates. For example, suppose the subgroups are male and female, and each has a value for “Age”. If
we merge the data from the two subgroups, what do we use for “Age”?
Independent subgroups within a study
The two options are
• To use subgroup as the unit of analysis
• To use study as the unit of analysis
If we use subgroup as the unit of analysis then each subgroup has its own value for each covariate, and
this value gets transmitted along with the effect size.
However, if we use study as the unit of analysis then we need one value for the study, and we need to
consider how this value is created. The rules are as follows.
• If both subgroups have the identical value (for example, age is coded as 40) then this same value is assigned to the study.
• If the value differs from one subgroup to the next within a study (one subgroup has age 40 and another has age 50) the covariate is transmitted as missing, and the study will be excluded from any regression that uses this covariate (see section on missing data).
You can use the “All studies” table to see where data are missing. If appropriate, you can return to the
data-entry screen and modify the data (perhaps use age 45 for both).
The same idea is applied to multiple outcomes.
• If we use all outcomes then each outcome has its own value for each covariate, and this value gets transmitted along with the effect size.
• If we use the mean of outcomes then we need one value for the study. If both outcomes have the identical value (for example, age is coded as 40) then this same value is assigned to the study. If the value differs (one outcome has age 40 and another has age 50) the covariate is transmitted as missing, and the study will be excluded from any regression that uses this covariate (see section on missing data).
You can use the “All studies” table to see where data are missing. If appropriate, you can return to the
data-entry screen and modify the data (perhaps use age 45 for both).
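The merging rule for covariates can be sketched in a few lines (our own helper, not CMA code): a study-level value is kept only when all merged rows agree, and otherwise is transmitted as missing.

```python
def merge_covariate(values):
    # Keep the covariate only when every merged row (subgroup or outcome)
    # has the identical value; otherwise transmit it as missing (None).
    unique = set(values)
    return unique.pop() if len(unique) == 1 else None

print(merge_covariate([40, 40]))   # 40    (both rows agree)
print(merge_covariate([40, 50]))   # None  (disagreement -> missing)
```

A study whose merged covariate comes back as missing is then dropped from any regression that uses that covariate, exactly as described in the section on missing data.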
Multiple time-points
The case of multiple time-points works the same as multiple outcomes.
MULTIPLE COMPARISONS
Here, we use the term “multiple comparisons” to refer to a study where the same control group serves as the comparator for a series of drugs.
If you look at one drug at a time, then the control-group data appears only once in each analysis, in which case one could argue that each analysis is valid on its own. However, if you run an analysis that compares all treatments vs. the control, then we need to address the fact that the same control group will appear more than once in the data.
There are several options that can be taken.
• One option is to use the same approach discussed for multiple outcomes.
• Another option is to divide the control group into segments when entering the data, and assign each segment to one treated group. For example, if the control group has 20 events over 100 people, treat it as being two control groups with 10 events and 50 people in each. This option is explored in “Introduction to Meta-Analysis”.
PART 19: SOME CAVEATS
STATISTICAL POWER FOR META-REGRESSION
Statistical power is the likelihood that a test of significance will reject the null hypothesis. In the case of
meta-regression it is the likelihood that the Z-test of a single covariate or the Q-test of a set of covariates
will yield a statistically significant p-value.
Power depends primarily on two factors:
• The magnitude of the effect
• The precision with which we measure the effect
In a fixed-effect analysis precision is driven primarily by the total number of individual subjects across all studies. In a random-effects analysis precision is driven by the total number of individual subjects across all studies, and also by the variance in treatment effects and the total number of studies (Hedges and Pigott, 2004).
While there is a general perception that power for testing the main effect is consistently high in meta-analysis, this perception is not correct, and certainly does not extend to tests of effects in meta-regression (Hedges and Pigott, 2001). The failure to find a statistically significant p-value in meta-regression could mean that the effect (if any) is quite small, but could also mean that the analysis had poor power to detect even a large effect. One should never use a non-significant finding to conclude that a covariate (or a set of covariates) is not related to effect size.
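As a rough illustration of why power can be low, here is a sketch of the approximate power of the two-tailed Z-test for a single covariate, based on the noncentrality parameter (true slope divided by its standard error); the slope and standard error below are hypothetical:

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_z_test(beta, se):
    # Approximate power of the two-tailed Z-test of a single covariate:
    # lam = beta / se is the noncentrality parameter (cf. Hedges and
    # Pigott, 2004). Uses alpha = 0.05, so the critical z is ~1.96.
    z_crit = 1.959964
    lam = beta / se
    return normal_cdf(lam - z_crit) + normal_cdf(-lam - z_crit)

# A sizable slope estimated imprecisely can still have poor power (~0.32),
# while halving the standard error raises power substantially.
print(round(power_z_test(beta=0.03, se=0.02), 3))
print(round(power_z_test(beta=0.03, se=0.01), 3))
```

The point of the sketch is the asymmetry: a non-significant result at low power tells us very little about whether the covariate matters.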
MULTIPLE COMPARISONS
In primary studies researchers often need to address the issue of multiple comparisons. The basic
problem is that if we conduct a series of analyses with alpha set at 0.05 for each, then the overall
likelihood of a type I error (assuming that the null is actually true) will exceed 5%. In the context of
regression, this could be a problem if we test a series of covariates.
Some suggest simply allowing for a 5% error rate for each covariate. To use this approach we can simply
work with the p-values on the main results screen. In this case it’s a good idea to evaluate the data in
context. For example, one significant p-value in forty tests would be suspect.
Some suggest conducting an omnibus test that asks if there are any non-zero effects, and then
proceeding to look at pair-wise comparisons only if the initial test meets the criterion for significance.
To implement this approach we could first look at the p-value for the model. Or, if some of the
covariates are nuisance variables (potential confounds that we want to hold constant) while others are
of interest (say, a set of covariates that represent the treatment condition) we could test the increment
for this set and use this as the gateway p-value.
Others suggest going straight to the individual covariates but using a stricter criterion for significance
(for example, a criterion alpha of 0.01 rather than 0.05 for five tests).
Others (e.g., Rothman 1990) suggest that there are many cases where we can safely ignore the problem
of multiple comparisons.
The same issues exist in meta-analysis and the approaches outlined above for primary studies can be applied to meta-analysis as well (e.g., Hedges and Olkin, 1985).
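The simplest of these adjustments (a Bonferroni-style correction, shown here with hypothetical p-values) can be sketched as:

```python
# With five covariate tests, keep the familywise error rate at 0.05
# by testing each covariate at 0.05 / 5 = 0.01.
alpha_family = 0.05
n_tests = 5
alpha_per_test = alpha_family / n_tests

p_values = [0.004, 0.030, 0.200, 0.008, 0.600]  # hypothetical
significant = [p for p in p_values if p < alpha_per_test]
print(alpha_per_test, significant)  # 0.01 [0.004, 0.008]
```

Note that under this stricter criterion the p-value of 0.030, which would pass an unadjusted 0.05 threshold, is no longer declared significant.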
PART 20: TECHNICAL APPENDIX
APPENDIX 1: THE DATASET
The motivating example in this book is the BCG dataset.
• The Excel™ version is called BCG.xls
• The CMA version is called BCG.cma
The original dataset includes only a few moderators (Latitude, Year, Allocation) as shown in Figure 225.
We needed to create some additional moderators for purposes of this book. Some of these are shown
in Figure 226 and all of these are listed in Table 10.
Table 10 also shows how we can create the new moderators using Excel. If we start with the dataset in
CMA, it is possible to create new moderators directly in CMA, but the process can become tedious. An
easier approach is to copy the relevant block of data from CMA to Excel™, add the new columns in
Excel™, and then copy these new columns back into CMA. Then, in CMA, we just need to assign a name
to each new column, identify it as a moderator, and assign a type (Categorical, Integer, or Decimal).
Table 10 shows the formulas that were employed in Excel™ to assign values to each study for the new
variables.
Figure 225 | BCG Data in Excel™
Figure 226 | BCG Data in Excel™
Table 10

Column   Variable             Type          Formula for Row 4

The following are the original variables
J        Latitude             Integer
K        Year                 Integer
L        Allocation           Categorical

Variables related to latitude
We compute the mean Latitude, which is 33.46154, in cell J18.
LatitudeC is Latitude minus the mean.
LatitudeC2 is the square of LatitudeC.
Climate is a categorical variable, coded either Hot (latitude < 40) or Cold.
Hot and Cold are dummy variables based on Climate.
M        LatitudeC            Integer       =J4-$J$18
N        LatitudeC2           Integer       =M4^2
O        Climate              Categorical   =IF(J4<40,"Hot","Cold")
P        Hot                  Integer       =IF($O4="Hot",1,0)
Q        Cold                 Integer       =IF($O4="Cold",1,0)

Variables related to Year
We compute the mean Year, which is 1948.0769, in cell K18.
YearC is Year minus the mean.
Time is a categorical variable, coded either Early (Year < 1945) or Recent.
Early and Recent are dummy variables based on Time.
R        YearC                Decimal       =K4-$K$18
S        Time                 Categorical   =IF(K4<1945,"Early","Recent")
T        Early                Integer       =IF($S4="Early",1,0)
U        Recent               Integer       =IF($S4="Recent",1,0)

Interactions
Hot x YearC is the interaction of Hot by Year (centered).
Hot x Recent is the interaction of Hot by Recent.
LatitudeC x YearC is the interaction of Latitude (centered) by Year (centered).
V        Hot x YearC          Decimal       =P4*R4
W        Hot x Recent         Integer       =P4*U4
X        LatitudeC x YearC    Decimal       =M4*R4
The “IF” formulas in Excel™ take the source column and recode it. Alternatively, you can simply type in
the new values.
Normally, you do not need to create the dummy variables such as Hot and Cold. Rather, you can enter
categorical variables into the regression and the program will create the dummy variables on the fly.
You do need to create the dummy variables (as shown here) if you will be running regression without
the intercept, or if you plan to use them in interactions.
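For readers who prefer to script this step, the recodes in Table 10 can be sketched in Python. This is only an illustration: the latitude values below are made up, and the variable names are ours rather than CMA's.

```python
# A sketch of the Table 10 recodes in Python rather than Excel.
# The latitude values below are made up for illustration.
latitudes = [44, 55, 42, 52, 13, 44, 19]

# Climate: "Hot" if latitude < 40, else "Cold"   (=IF(J4<40,"Hot","Cold"))
climate = ["Hot" if lat < 40 else "Cold" for lat in latitudes]

# Dummy variables Hot and Cold   (=IF($O4="Hot",1,0), =IF($O4="Cold",1,0))
hot = [1 if c == "Hot" else 0 for c in climate]
cold = [1 if c == "Cold" else 0 for c in climate]

print(climate)  # ['Cold', 'Cold', 'Cold', 'Cold', 'Hot', 'Cold', 'Hot']
print(hot)      # [0, 0, 0, 0, 1, 0, 1]
```

The same pattern extends to the Time, Early, and Recent columns.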
In Figure 226, row 2 indicates if each variable is categorical, integer, or decimal. In CMA, when you
define a variable as a moderator, you must also specify that it is one of these types.
A note on Categorical and Dummy variables
A categorical variable is a variable where the codes represent distinct groups, and it makes sense to say
that the groups are different from each other, but not that one is “more than” another. A common
example is Gender. Usually, each study is assigned a text value (Male or Female) for the variable.
A categorical variable cannot be used directly in the regression. Rather, we create dummy variables that
contain the same information as the categorical variables, and use these in the analysis.
In most cases, the program is able to create the dummy variables on the fly. Concretely, if you enter a
categorical variable into the analysis, the program will create one (or more) dummy variables, assign a
code for each study, and enter these dummy variables into the analysis.
When this approach can be used, it offers the advantage of speed and simplicity. However, this approach can only be used in simple regressions. This approach cannot be used if

• We want to omit the intercept from the prediction model
• We want to use effects-coding, contrast-coding, or some other scheme
• We want to create interaction terms that include the categorical variable
In any of these cases, we need to create the dummy codes manually and enter the data as we would for any other study. Note that the dummy variables must be identified as integer or decimal, and not as categorical. For example,

• Climate is categorical (the values are Cold and Hot). However, the variable called Cold is an integer variable (the values are 0 and 1) and the variable called Hot is an integer variable (the values are 0 and 1).
• Time is categorical (the values are Early and Recent). However, the variable called Early is an integer variable (the values are 0 and 1) and the variable called Recent is an integer variable (the values are 0 and 1).
APPENDIX 2: UNDERSTANDING Q
Introduction
In a simple meta-analysis we compute Q, a number that reflects the dispersion of all effect sizes about
the mean effect size. Then we use Q to compute an array of indices such as T, T2, and I2, that address
specific aspects of dispersion. We briefly review the computation of Q and its relationship to these
indices. Then we proceed to the primary focus of this appendix, to show that the computation of Q and
the other indices is fundamentally the same for a simple analysis (with one group of studies), for a
subgroups analysis, and for a regression. In all three cases we compute Q by working with the deviation
of effect sizes from the predicted effect size. In a simple analysis the predicted effect size is the mean of
all studies, in a subgroups analysis it is the mean of the studies within the subgroup, and in the
regression it is the relevant point on the regression line.
Background
Q is a measure of dispersion on a standardized scale. We use Q to address the question "Do all studies share a common effect size, Y?" To understand how Q works, consider the following.
Suppose that we have a single study and want to ask, “Is the true effect size for this study equal to Y?”
We could compute a Z-value for the study using

Z = (X − Y) / SE_X ,   (1.34)

where X is the observed effect size, Y is the predicted effect size, and SE_X is the standard error of the effect size for this study.
To test the hypothesis that the true effect for the study is Y, and that the observed deviation (X−Y) is due
to sampling error, we could compare the observed Z to the Z distribution. If the observed Z exceeds 1.96
(or −1.96) we might conclude that the true effect size is probably not Y.
Alternatively, we could have performed the same analysis by working with Z2. The square of Z is called
Q, and the observed Q value is evaluated with reference to the chi-squared distribution with 1 df. If the
observed Q exceeds 3.84 (that is, 1.962) we might conclude that the true effect size is probably not Y.
In this example we wanted to test the hypothesis that the true value for one study is Y. The same logic
can be extended to multiple studies, and this is exactly what happens when we test the assumption of
homogeneity. For example, suppose that the meta-analysis includes five studies, and we want to test
the assumption of homogeneity, that the true effect size for each of the five studies is Y. We compute Z2
for each study and sum these values over all studies to yield Q. We then evaluate Q with reference to the chi-squared distribution with df equal to the number of studies minus one (here, df = 4).
We’ve elected to present Q in this way in order to highlight its relationship to Z. Researchers
understand that Z is on a standardized scale, and that, if all studies share a common true effect size,
then observed effect sizes with deviations of 1.96 or 2.58 on this scale occur in 5% or 1% of cases,
respectively. From this perspective, it’s easy to understand how Q can serve the same function.
In most texts (including this one) Q is also described as a “Weighted sum of squares”. This description
highlights the fact that the deviations are weighted, and emphasizes the fact that the deviation of X
from Y (the observed effect from the predicted effect) gets more weight in a more precise study
(typically a larger study) than it does in a less precise (smaller) study.
To compute a weighted sum of squares we would compute each deviation and then weight it by the inverse variance. Thus, for any single study

Q = (X − Y)^2 × (1/V) .   (1.35)

That is, we start with the deviation (squared) and then weight that value by the inverse of the within-study variance.

This equation (1.35) highlights the fact that studies with lower variance (typically the larger studies) are assigned greater weight. The earlier equation (1.34) highlighted the fact that the Q-values are on a standardized scale (squared). But in fact, the two formulas are algebraically equivalent.
To see this, note that (1.35) is in squared values. If we take the square root of both sides of the equation we get

Z = (X − Y) × (1/SE_X)   (1.36)

or

Z = (X − Y) / SE_X ,   (1.37)

which is identical to (1.34).
In either case (whether we use (1.34) or (1.35)) we need to sum the Q values over all studies. Hence the
term “WSS”, or weighted sum of squares.
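For a single study, the equivalence of the two formulations can be checked numerically. A minimal Python sketch; the observed effect, predicted effect, and variance below are made up for illustration:

```python
import math

# Sketch: for a single study, Q computed as a weighted squared deviation
# (1.35) equals the square of Z from (1.34). All values are made up.
x = -0.8       # observed effect size
y = -0.4303    # predicted effect size
v = 0.04       # within-study variance, so SE = 0.2

z = (x - y) / math.sqrt(v)   # (1.34)
q = (x - y) ** 2 * (1 / v)   # (1.35)

print(abs(q - z ** 2) < 1e-9)  # True: the two formulations agree
```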
Once we have computed Q, we can use it for a number of purposes.
First, we can test the Q value for statistical significance. If we are working with the fixed-effect model,
the model requires that all studies share a common effect size. If the Q-value is statistically significant,
we conclude that the studies do not share a common effect size, and this assumption has been violated.
We probably should not be using the fixed-effect model.
Figure 227 | Flowchart showing how T2 and I2 are derived from Q
Second, we can use Q to estimate the amount of between-study variance and related indices. Since Q follows the chi-squared distribution, the expected value of Q is equal to df. It follows that if the observed value of Q exceeds the df, the excess can be attributed to between-study variance. This value (Q − df) is the basis for the following.
Variance and standard deviation of true effects
To compute T2, the between-studies variance, we start with (Q – df). We multiply this difference by a
factor (called C) that puts it into the metric of the effect size, squared. If T2 is the variance of effect sizes
between studies then its square root, T, is the standard deviation of effect sizes between studies.
Ratio of true variance to total variance
To compute I2, the ratio of true to total variance, we start with (Q − df). If Q reflects the total WSS and (Q − df) reflects the WSS between studies, then the ratio (Q − df)/Q is by definition the ratio of true/total variance, called I2. By convention we multiply I2 by 100 and express it as a percentage (0% to 100%).
Proportion of true variance explained by the predictors
In primary studies, R2 is the proportion of variance explained by predictors. In meta-regression the
analog to R2 is the proportion of true variance explained by predictors, based on the explained variance
as a proportion of the original variance.
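These derivations can be illustrated numerically, using the Case A and Case B values reported later in this appendix (Q = 152.2329, df = 12, C = 454.1808 for the intercept-only model, and a residual T2 of 0.0964 once Climate is added). A minimal Python sketch:

```python
# Sketch: deriving T2, I2, and R2 from Q, using the BCG values reported
# in Case A (intercept only) and Case B (Climate added) of this appendix.
q, df, c = 152.2329, 12, 454.1808

t2_total = (q - df) / c        # between-studies variance, as in (1.39)
i2 = (q - df) / q * 100        # percent of total variance that is true variance

t2_resid = 0.0964              # unexplained variance once Climate is added
r2 = (t2_total - t2_resid) / t2_total   # proportion of true variance explained

print(round(t2_total, 4), round(i2, 2), round(r2, 4))  # 0.3088 92.12 0.6878
```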
Important
It is important to note that Q is computed in exactly the same way (using weights based solely on within-study variance) regardless of whether we are working with a fixed-effect or a random-effects model.
The only difference between the models is in how we interpret Q and the associated statistics.
• Under the fixed-effect model, Q addresses the statistical model. If Q is statistically significant the statistical model is not valid.
• Under the random-effects model, Q addresses the prediction model. If Q is statistically significant the prediction model is incomplete, meaning that some of the true variance in effects is unexplained.
All of these statistics are based on Q, which is based on the deviation of observed from predicted
effects. Below, we show how this works for the three general cases we called (A) one group of studies,
(B) two subgroups of studies, and (C) a regression. The key point we want to emphasize with these
examples is that these cases are all fundamentally the same.
Case A
The computation of Q and derivative statistics for the simple case (the overall mean) is shown here.
First, we run the regression using fixed-effect weights (Figure 228).
Figure 228 | Case-A | Main results | Fixed-effect weights
This yields the prediction equation

Y = −0.4303 .   (1.38)
Figure 229 | Case-A | Computing Q
In Figure 229 we use (1.38) to compute the weighted squared deviation for each study and sum these to
yield Q, which is 152.2329 (allowing for rounding error).
Working with Q we then compute T2, the variance of true effect sizes in the sample. It is given by

T2 = (Q − df) / C = (152.2329 − 12) / 454.1808 = 0.3088 .   (1.39)
In this equation df is the number of studies minus the number of predictors (the intercept), and C is a
conversion factor that allows us to move from the standardized scale of Q to the metric of the effect
size. This estimate of T2 is then employed to assign random-effect weights to all studies and re-run the
regression. The result is shown in Figure 230, where the prediction equation is
Y = −0.7141 .   (1.40)
Figure 230 | Case-A | Main results | Random-effects
The standard deviation of true effect sizes, T, is given by

T = √T2 = √0.3088 = 0.5557 .   (1.41)
If we assume that the true effects are normally distributed about their predicted value (and that the predicted value is correct), then 95% of all studies would have true effects in the range given by the predicted value plus/minus 1.96 T.
Under the random-effects model the predicted effect for all studies is −0.7141, and so the true effects would fall in the range of

LL = −0.7141 − 1.96 × 0.5557 = −1.8032   (1.42)

UL = −0.7141 + 1.96 × 0.5557 = 0.3750   (1.43)
These are the values we use to create the graphic. The normal curve is centered at −0.7141 and extends roughly from −1.8032 to 0.3750, intended to reflect the true effect size in some 95% of relevant studies. (This simplified example assumes that the mean is known. To actually compute prediction intervals we would take account of the error in estimating the mean.)
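The computation above can be sketched in Python, using the Case A estimates (mean −0.7141, T2 = 0.3088); small differences in the last decimal relative to the values printed above are rounding:

```python
import math

# Sketch: approximate 95% range of true effects under the random-effects
# model, using the Case A estimates (mean -0.7141, T2 = 0.3088).
mean, t2 = -0.7141, 0.3088
t = math.sqrt(t2)        # standard deviation of true effects, as in (1.41)

ll = mean - 1.96 * t     # as in (1.42)
ul = mean + 1.96 * t     # as in (1.43)

print(round(t, 4), round(ll, 4), round(ul, 4))
```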
Figure 231 | Case-A | Dispersion of effects about regression line
We can also compute I2, the ratio of true variance to total variance. This is given by

I2 = (Q − df) / Q = (152.33 − 12) / 152.33 = 92.12% .   (1.44)
Typically we can also compute R2, the proportion of variance explained by the covariates. In this case
there are no covariates, and so this is not applicable.
Case B
When we turn to case B (two subgroups) all of the formulas remain the same. The only thing that
changes is that the predicted value is now different for studies in each subgroup. (Also, the conversion
factor C is based on a different formula).
First, we run the regression using fixed-effect weights (Figure 232).
Figure 232 | Case-B | Main results | Fixed-effect weights
This yields the prediction equation

Y = −0.9986 + 0.8870 × Hot ,   (1.45)

where Hot is 0 for cold climates and 1 for hot climates.
Figure 233 | Case-B | Computing Q
In Figure 233 we use (1.45) to compute the weighted squared deviation for each study and sum these to yield Q, which is 41.7894 (allowing for rounding error).
Working with Q we then compute T2, the variance of true effect sizes in the sample. It is given by

T2 = (Q − df) / C = (41.7894 − 11) / 319.4579 = 0.0964 .   (1.46)
In this equation df is the number of studies minus the number of predictors (the intercept and climate), and C is a conversion factor that allows us to move from the standardized scale of Q to the metric of the effect size. This estimate of T2 is then employed to assign random-effect weights to all studies and re-run the regression.
Figure 234 | Case-B | Main results | Random-effects
The result is shown in Figure 234, where the prediction equation for cold studies is

Y = −1.1987 + 0 × 0.9203 = −1.1987   (1.47)

and for hot studies is

Y = −1.1987 + 1 × 0.9203 = −0.1116   (1.48)
The standard deviation of true effect sizes, T, is given by

T = √T2 = √0.0964 = 0.3105 .   (1.49)
If we assume that the true effects are normally distributed about their predicted value (and that the predicted value is correct), then 95% of all studies would have true effects in the range given by the predicted value plus/minus 1.96 T.
For cold climates, 95% of true effects would fall in the range of

LL = −0.9986 − 1.96 × 0.3105 = −1.6071   (1.50)

UL = −0.9986 + 1.96 × 0.3105 = −0.3901   (1.51)
For hot climates, 95% of true effects would fall in the range of

LL = −0.1116 − 1.96 × 0.3105 = −0.7201   (1.52)

UL = −0.1116 + 1.96 × 0.3105 = +0.4969   (1.53)
These are the values we use to create the graphic shown in Figure 235.
Figure 235 | Case-B | Dispersion of effects about regression line
For cold studies the normal curve is centered at −0.9986 and extends roughly from −1.6071 to −0.3901, intended to reflect the true effect size in some 95% of relevant studies. For hot studies the normal curve is centered at −0.1116 and extends roughly from −0.7201 to +0.4969, intended to reflect the true effect size in some 95% of relevant studies. (This simplified example assumes that the mean is known. To actually compute prediction intervals we would take account of the error in estimating the mean.)
We can also compute I2, the ratio of true variance to total variance. This is given by

I2 = (Q − df) / Q × 100 = (41.7894 − 11) / 41.7894 × 100 = 73.68% .   (1.54)
We can also compute R2, the proportion of variance explained by the covariates. The unexplained variance with only the intercept is 0.3088 and the unexplained variance with the intercept plus climate is 0.0964, which means that the variance explained by climate is

T2Explained = 0.3088 − 0.0964 = 0.2124 .   (1.55)
Then, R2 is the explained variance as a proportion of the total variance, or

R2 = T2Explained / T2Total = 0.2124 / 0.3088 = 0.6878 .   (1.56)
Case C
Finally, we can apply exactly the same approach to compute Q for the case where we use latitude to
predict the effect size.
First, we run the regression using fixed-effect weights (Figure 236).
Figure 236 | Case-C | Main results | Fixed-effect weights
This yields the prediction equation

Y = 0.3436 − 0.0292 × Latitude ,   (1.57)

where Latitude is an absolute value.
Figure 237 | Case-C | Computing Q
In Figure 237 we use (1.57) to compute the weighted squared deviation for each study and sum these to yield Q, which is 30.7339 (allowing for rounding error).
Working with Q we then compute T2, the variance of true effect sizes in the sample. It is given by

T2 = (Q − df) / C = (30.7339 − 11) / 311.7368 = 0.0633 .   (1.58)
In this equation df is the number of studies minus the number of predictors (the intercept and latitude), and C is a conversion factor that allows us to move from the standardized scale of Q to the metric of the effect size. This estimate of T2 is then employed to assign random-effect weights to all studies and re-run the regression.
Figure 238 | Case-C | Main results | Random-effects
The result is shown in Figure 238, where the prediction equation for all studies is

Y = 0.2595 − 0.0292 × Latitude .   (1.59)
The standard deviation of true effect sizes, T, is given by

T = √T2 = √0.0633 = 0.2516 .   (1.60)
If we assume that the true effects are normally distributed about their predicted value (and that the predicted value is correct), then 95% of all studies would have true effects in the range given by the predicted value plus/minus 1.96 T.
Figure 239 | Case-C | Dispersion of effects about regression line
In the figure we show the range of effects for three arbitrary points on the regression line. For studies at a latitude of 20, the predicted effect size is

Y = 0.2595 − 0.0292 × 20 = −0.2404 .   (1.61)

Then, 95% of true effects would fall in the range of

LL = −0.2404 − 1.96 × 0.2516 = −0.7335   (1.62)

UL = −0.2404 + 1.96 × 0.2516 = 0.2527   (1.63)
Thus, for studies at a latitude of 20 the normal curve is centered at −0.2404 and extends roughly from −0.7335 to +0.2527, intended to reflect the true effect size in some 95% of relevant studies. (This simplified example assumes that the mean is known. To actually compute prediction intervals we would take account of the error in estimating the mean.)
We can also compute I2, the ratio of true variance to total variance. This is given by

I2 = (Q − df) / Q × 100 = (30.7339 − 11) / 30.7339 × 100 = 64.21% .   (1.64)
We can also compute R2, the proportion of variance explained by the covariates. The unexplained variance with only the intercept is 0.3088 and the unexplained variance with the intercept plus latitude is 0.0633, which means that the variance explained by latitude is

T2Explained = 0.3088 − 0.0633 = 0.2455 .   (1.65)
Then, R2 is the explained variance as a proportion of the total variance, or

R2 = T2Explained / T2Total = 0.2455 / 0.3088 = 0.7950 .   (1.66)
APPENDIX 3: TESTS OF HETEROGENEITY
In any analysis, whether based on the fixed-effect model or the random-effects model, tests of
heterogeneity are based on fixed-effect weights. Our goal in this section is to explain why.
Consider Figure 240, which presents results for a simple analysis, with one set of studies.
Figure 240 | Heterogeneity statistics in basic analysis
At the left the program shows the analyses that address the effect size. One line presents results based
on fixed-effect weights and the other presents results based on random-effects weights. Since the study
weights on the first row are based on V while those on the second row are based on V + T2, the effect
size, the standard error, and all other statistics differ from one line to the next.
In the section on heterogeneity one might similarly expect to see two sets of statistics, one based on fixed-effect weights and the other on random-effects weights. In fact, though, there is only one set of statistics, based on fixed-effect weights, that applies to both statistical models.
The reason has to do with the nature of the null hypothesis for heterogeneity, and the Q statistic that we use to test this hypothesis. Recall that Q is the WSS, or the weighted sum of squared deviations. That is,

1. We take the deviation of every effect size from the mean effect size
2. We square that deviation
3. We weight the squared deviation
4. We sum this value across all studies
Under the fixed-effect model,

In step 1 we compute the deviation of every effect size from −0.4303. In step 3 we weight each squared deviation by 1/V. Computed in this way, Q is 152.2330. With 12 degrees of freedom the corresponding p-value is < 0.0001.
Similarly, under the random-effects model,
In step 1 we compute the deviation of every effect size from −0.4303. In step 3 we weight each squared
deviation by 1/V. Computed in this way, Q is 152.2330. With 12 degrees of freedom the corresponding
p-value is < 0.0001.
Thus, the test is identical under the two models because under the null there is no variation in true effects, which means that T2 is zero. It follows that:

1. In step 1, when we compute the deviation of each effect size from the mean effect size, the mean effect size we need to use is −0.4303 (the fixed-effect estimate) rather than −0.7141 (the random-effects estimate). If T2 is zero, the weighted mean would be −0.4303.
2. In step 3, when we weight each squared deviation to get the weighted sum of squares (Q), the weights must be based on V. Again, if T2 is zero then it has no impact on the weights.
Thus, the expected variation in effects under the null (that T2 is zero) is identical whether we are using
the fixed-effect model or the random-effects model. Therefore, the test of this null is identical under
both models.
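The four steps can be sketched in Python. The effect sizes and variances below are made up for illustration, not the BCG data:

```python
# Sketch of the four steps for computing Q, with made-up effect sizes
# and within-study variances (not the BCG data).
effects = [-0.9, -0.4, 0.1, -0.6]
variances = [0.05, 0.10, 0.08, 0.04]

weights = [1 / v for v in variances]    # fixed-effect weights, 1/V
mean = sum(w * x for w, x in zip(weights, effects)) / sum(weights)

# Steps 1-4: deviation from the mean, square, weight, sum
q = sum(w * (x - mean) ** 2 for w, x in zip(weights, effects))

# Under the null T2 is zero, so the same mean, the same weights, and
# therefore the same Q apply under both the FE and the RE model.
print(round(mean, 4), round(q, 4))  # -0.5296 7.9907
```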
If it still seems odd that we're using fixed-effect weights to test the null for a random-effects model, this might help. Rather than thinking of the test as using fixed-effect weights based on V, think of it as using random-effects weights based on V + T2, where (under the null) T2 happens to be zero.
Of course, the same logic holds if we use a regression approach.
Figure 241 | Heterogeneity statistics in regression
Figure 242 | Heterogeneity statistics in regression
Figure 241 and Figure 242 show results for a fixed-effect analysis and a random-effects analysis using the same data as Figure 240.
• In the fixed-effect analysis the residual Q is 152.2330 with df = 12 and p < 0.0001.
• In the random-effects analysis the Goodness of fit test shows Q is 152.2330 with df = 12 and p < 0.0001.
These numbers are identical to each other.
While the test for heterogeneity is identical under the FE and the RE models, the way that we use the
test results depends on the model.
• Under the FE model a significant p-value for heterogeneity tells us that the data are not consistent with the model. There is no reason to compute T2 or I2, since these are assumed to be zero. For this reason, regardless of whether the p-value is statistically significant, Figure 241 shows no estimate of T2 nor of I2.
• Under the RE model a significant p-value tells us that there is empirical evidence of heterogeneity, while a non-significant p-value tells us that this evidence is not present. Whether or not the p-value is statistically significant, we proceed to compute T2, which is incorporated into the weighting scheme. For this reason, regardless of whether the p-value is statistically significant, in Figure 242 the line that displays Q also displays an estimate of T2 and I2.
To this point we’ve established that in a simple analysis, the test of the null that T2 is zero is identical for
fixed-effect and for random-effects models. The same logic applies when we move on to more complex
analyses, such as analyses that involve subgroups or continuous covariates.
Figure 243 shows an analysis to assess the relationship between effect size and climate.
As was true for the simple analysis, hypothesis tests that involve the relationship between effect size and climate do depend on the statistical model.
Under the fixed-effect model the difference in subgroup means is based on computations where studies
are weighted by 1/V. The Q-value is 110.4436, with 1 df and p < 0.0001.
Under the random-effects model the difference in subgroup means is based on computations where studies are weighted by 1/(V + T2). The Q-value is 15.5445, with 1 df and p = 0.0001.
By contrast, when we turn to the question of heterogeneity within subgroups, the program presents only one set of statistics. As before, these statistics are based on fixed-effect weights and are displayed in the fixed-effect section.
Figure 243 | Heterogeneity statistics with subgroups
The null hypothesis for heterogeneity is that studies within subgroups share the same effect size. To
test the null that T2 is zero we compute Q and df within subgroups and sum these values across
subgroups. In this case, for the Cold and Hot subgroups Q is 20.3464 plus 21.4431 for a total of 41.7894,
while degrees of freedom are 5 plus 6 for a total of 11. A Q value of 41.7894 with 11 df yields a p-value
of < 0.0001, and we conclude that there probably is some variation in true effects, even within
subgroups.
As was true for the simple analysis, the computation of Q is identical under the FE and the RE models since in both cases T2 under the null is zero. When T2 is zero the mean effect size in each group is computed with weights based on V rather than V + T2, and the weights assigned to compute the weighted sum of squares are based on V rather than V + T2.
To this point we’ve shown that in a simple analysis and also in a subgroups analysis, the test of
homogeneity is identical for the fixed-effect and the random-effects models. The same applies also to a
regression analysis with continuous covariates.
Figure 246 and Figure 247 show a regression analysis using latitude as a covariate, for the fixed-effect model and the random-effects model, respectively.
Under the null hypothesis for homogeneity, T2 is zero for all studies with the same predicted effect size (here, for studies at the same latitude). And, as before, if the null is true then study weights are based on 1/V for both models. This means that under the null hypothesis for heterogeneity, the regression line for the RE model will be identical to the regression line under the FE model, and so the squared deviation of each study from the regression line will be the same. And the weights employed to compute WSS will be the same under the two models. So the Q value will be identical under the two models.
For that reason the statistics for the residual Q in Figure 246 and the statistics for Goodness of fit in Figure 247 are identical to each other.
Figure 244 | Heterogeneity statistics with subgroups
Figure 245 | Heterogeneity statistics with subgroups
Figure 246 | Heterogeneity statistics with continuous covariate
Figure 247 | Heterogeneity statistics with continuous covariate
To avoid confusion we should emphasize that this section deals only with the issue of heterogeneity, and not with the issue of effect sizes.

• When we are focusing on the effect size itself, the test of the null does depend on the statistical model. Under the FE model we pose the null that the effect size is zero and test this null assuming a variance of V. Under the RE model we pose the null that the effect size is zero, and test this null assuming a variance of V + T2. (The same idea carries through to the more complex cases.)
• By contrast, when we are focusing on T2 itself, as we do in this chapter, the test of the null is the test that T2 is zero, and as such is identical for both statistical models.
Finally, while the computation of Q is identical for the FE and the RE model, the implications of the Q
statistic, and what we do with this statistic, do depend on our assumptions about the statistical model.
As explained earlier, the Q-value can be used to test the null hypothesis for statistical significance. It can also be used to compute T2 (the variance of true effect sizes) and to compute I2 (the proportion of observed variance that represents differences in real effects rather than sampling error).
If we have adopted the fixed-effect model, a significant Q-value tells us that the data are not consistent
with the model. Whether or not the Q-value is statistically significant we do not compute T2 nor I2, since
under the FE model these are defined as being zero.
If we have adopted the random-effects model, then the fact that Q is (or is not) statistically significant
has little real bearing. The model allows for variance in true effects and is valid whether or not this
variance exists. Whether or not the Q-value is statistically significant we do compute T2 (which is
employed to assign weights) and I2 (which helps us to describe the distribution of effects).
APPENDIX 4: COMPUTING τ2 IN THE PRESENCE OF SUBGROUPS
When we’re working with subgroups, it’s clear that we need to estimate the between-study variance T2
within subgroups rather than for the full set of studies. In our example, when we’re estimating the
mean effect for Hot studies and for Cold studies, the between-study variance that we need to assign
weights and to discuss the unexplained variance is clearly the variance within subgroups.
However, there are two ways to estimate τ2 within subgroups.
One option is to compute one estimate of τ2 for the Hot studies, and a separate estimate for the Cold
studies. Then, we would use each estimate for the corresponding set of studies.
The other option is to compute one estimate of τ2 for the Hot studies and a separate estimate for the
Cold studies. Then we would pool the two estimates and use the pooled estimate for both sets of
studies.
The logic for choosing the second option is that estimates of τ2 are not reliable unless they're based on a large number of studies. Unless we have good reason to believe that τ2 will differ substantially from one subgroup to the next, it's often a better idea to assume that the true value of τ2 is comparable for each subgroup, and we'll get a better estimate of this common value by pooling the within-subgroups estimates.
This is the option that we use with meta-regression since (at least when we’re using continuous
covariates) the first option is not tenable. For purposes of comparing the subgroups analyses with the
regression, we must select this option for subgroups as well.
On the analysis screen

• Select Computational options > Mixed and random effects options
• Select the option to Assume a common among-study variance across subgroups
• Select the option to Combine subgroups using a fixed-effect model
Figure 248 | Computing τ2 in the presence of subgroups
Figure 249 | Computing τ2 in the presence of subgroups
Figure 250 | Computing τ2 in the presence of subgroups
In this example we would pool the within-subgroup estimates of T2 and apply the pooled estimate to
both subgroups. The program does not display the pooled estimate on this screen. To see the pooled
estimate click Next table and Calculations. The pooled estimate is 0.0964.
Figure 251 | Computing τ2 in the presence of subgroups
For those who are interested, we show how to actually pool the estimates of T2 to get the pooled estimate, and how to display the pooled estimate. The mechanism for pooling is not to take the mean of the two estimates, but rather to pool the underlying statistics and then compute T2 from the combined data.
Recall that

T2 = (Q − df) / C ,   (1.67)

and so to compute a pooled value we use

T2 = (ΣQ − Σdf) / ΣC ,   (1.68)
where values are summed across all subgroups. To get the within-subgroup values prior to pooling

• Select "Do not assume a common among-study variance component"
• Click Fixed
• Click Calculations
In Figure 252

• Column C shows within-subgroup values of 110.9413 and 208.5166
• Column Q shows within-subgroup values of 20.3464 and 21.4431
• Column df shows within-subgroup values of 5 and 6

     | C        | Q       | df
Cold | 110.9413 | 20.3464 | 5
Hot  | 208.5166 | 21.4431 | 6
Sum  | 319.4579 | 41.7895 | 11
Figure 252 | Computing τ2 in the presence of subgroups
T2 = (41.7895 − 11) / 319.4579 = 0.0964 ,   (1.69)
which is the same value we saw in Figure 251.
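The pooling can be sketched in Python, using the within-subgroup values from Figure 252:

```python
# Sketch: pooling within-subgroup heterogeneity statistics as in (1.68),
# using the Cold and Hot values shown in Figure 252.
subgroups = {
    "Cold": {"C": 110.9413, "Q": 20.3464, "df": 5},
    "Hot":  {"C": 208.5166, "Q": 21.4431, "df": 6},
}

sum_q = sum(s["Q"] for s in subgroups.values())
sum_df = sum(s["df"] for s in subgroups.values())
sum_c = sum(s["C"] for s in subgroups.values())

t2_pooled = (sum_q - sum_df) / sum_c
print(round(t2_pooled, 4))  # 0.0964, the value displayed in Figure 251
```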
Be sure to switch the option back to “Assume a common value” and to re-select the tab for Random.
APPENDIX 5: CREATING VARIABLES FOR INTERACTIONS
In general, it’s a good idea to create all the variables you’ll need on the data-entry screen before
proceeding to the analysis. However, if you’re in the middle of the analyses and discover that you need
In some cases it’s easiest to enter data for the new variables manually. This may be the case if you have
only a few studies and the computation of the new data points is simple (for example, each study is
coded 0 or 1).
In other cases, however, it’s easier to copy the data out to Excel, create the data for the new variables,
and then copy the data back into CMA. We show that process here.
Suppose we’re working with the data set shown in Figure 253, which includes the variable Latitude. At
some point we realize that we need a variable called Latitude-C (centered) and one called Latitude-C2
(centered, then squared) for the analyses.
Figure 253 | Creating variables for interactions
Figure 254 | Creating variables for interactions
• Insert a column called Latitude-C and define it as Moderator > Decimal
• Insert a column called Latitude-C2 and define it as Moderator > Decimal
Figure 255 | Creating variables for interactions
• Click on the Latitude column
• Click Edit > Copy with Header
Open Excel™
• Paste the column into Column A
• Define Cell A18 as =AVERAGE(A3:A15)
• Define Cell B3 as =A3-$A$18 and copy to other rows
• Define Cell C3 as =B3^2 and copy to other rows
• Copy columns B and C (rows 3 to 15) to the clipboard
Figure 256 | Creating variables for interactions
• Click on Row-1 in the Latitude-C column
• Press CTRL-V to paste the data
• Save the file with the new data
Figure 257 | Creating variables for interactions
APPENDIX 6: PLOTTING A CURVILINEAR RELATIONSHIP
The spreadsheet used in this example is [Plot of curvilinear relationship.xlsx]
In Part 14: Interaction we showed how to use Latitude and Latitude2 to predict effect size. The
program will not plot a curvilinear relationship, so we need to create the plot in Excel™.
In this example we used Latitude-C (centered) and Latitude-C2 as covariates. Figure 258 shows the
results of the analysis. To create the plot we need the coefficients [A] from this figure.
Figure 258 | Plotting a curvilinear relationship
In Excel™, create columns as shown in Table 11 and Figure 259.
Table 11

Column                                     Comment
Column A is a constant                     All studies have a value of 1
Column B is the intercept (B0)             The value of −0.5501 comes from Figure 258
Column C is Latitude-C                     Values range from −20 to +20
Column D is coefficient for Latitude-C     The value of −0.0316 comes from Figure 258
Column E is Latitude-C, squared            This is the square of column C
Column F is coefficient for Latitude-C2    The value of −0.0008 comes from Figure 258
Column G is blank
Column H is latitude, un-centered          This is column C plus the mean latitude, 33.46
Column I is the predicted effect size      For row 3 this is =A3*B3+C3*D3+E3*F3
Figure 259 | Plotting a curvilinear relationship
In rows 3-43 we enter data for a sequence of possible studies. There are 41 rows, in which the value of
Latitude-C (centered) ranges from −20 to +20. Note that these are not the actual studies in our analysis,
but rather values spanning the approximate range of Latitude-C in our analysis.
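The spreadsheet computation can be mirrored in Python (a sketch; the coefficients are the rounded values quoted from Figure 258, so the plotted values may differ very slightly from CMA’s):

```python
# Predicted log risk ratio as a function of centered latitude,
# using the (rounded) coefficients quoted from Figure 258.
b0, b1, b2 = -0.5501, -0.0316, -0.0008  # intercept, Latitude-C, Latitude-C2
mean_latitude = 33.46                   # added back to un-center the X-axis

points = []
for lat_c in range(-20, 21):            # Latitude-C from -20 to +20 (41 rows)
    pred = b0 + b1 * lat_c + b2 * lat_c ** 2   # same as row formula =A*B+C*D+E*F
    points.append((lat_c + mean_latitude, pred))

print(len(points))   # 41, matching rows 3-43 of the spreadsheet
```

At the mean latitude (Latitude-C = 0) the prediction is simply the intercept, −0.5501.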
At this point column H holds the latitudes while Column I holds the predicted values. We can use these
two columns to create a plot.
The instructions here are for Microsoft Excel™ 2010. The specific commands may vary slightly for other
versions.
• Highlight the two columns from row 3 to the bottom (Figure 260)
• Select Insert > Scatter > Scatter with smooth lines
• The program creates this plot (Figure 261)
Figure 260 | Plotting a curvilinear relationship
Figure 261 | Plotting a curvilinear relationship
You may want to customize the graph as follows

Layout > Legend
  None
Layout > Axes > Primary vertical axis
  More options > Vertical axis crosses > Axis value −1.8
  More options > Number > Number > Decimal places > 2
Layout > Chart title
  Above chart > Effect size as a function of Latitude and Latitude2
Layout > Axis title
  Primary horizontal axis title > Title below Axis > Latitude
  Primary vertical axis title > Rotated Title > Log Risk Ratio
At this point the plot should look like Figure 262
Figure 262 | Plotting a curvilinear relationship
Notes
The key to plotting a non-linear relationship is that we create a row for every value of Latitude-C. Then
we computed the corresponding value of the square, and the predicted value. The same idea can be
extended for higher-order relationships as well.
We could have plotted effect size as a function of Latitude-C, in which case the X-axis would run from
−20 to +20. Instead, we un-centered the predictor (adding the mean latitude to each centered value),
and so the axis runs from 0 to 60. Critically, this edit takes place only at the last step. The prediction
equations are based on the centered scores.
APPENDIX 7: PLOTTING INTERACTIONS
In this appendix we show how to plot three kinds of interactions:
• The interaction of two categorical covariates
• The interaction of one categorical and one continuous covariate
• The interaction of two continuous covariates
All three follow basically the same format in Excel™. The only important difference is that for the first
two we select the Line graph, and for the third we select the Scatter graph.
Plotting the interaction of two categorical covariates
In an earlier chapter we ran an analysis to assess the interaction of Climate by Time. Here, we show how to plot
this interaction using Excel™.
The spreadsheet used in this example is [Plot of hot x time.xlsx]
The original data set includes the covariates Latitude and Year, both of which are continuous. For
purposes of this discussion we need two categorical covariates, and so we create them by dichotomizing
Latitude and Year.
• Hot is coded 1 if the latitude is 34 or less, and 0 if the latitude exceeds 34.
• Recent is coded 1 if the Year is 1945 or later, and 0 if the Year is earlier than 1945.
• Hot x Recent is created by multiplying Hot by Recent.
The results of the analysis are shown in Figure 264. To create the plot we’ll need the covariate names
and coefficients from this figure. Copy these to Excel™ columns B and C as shown in Figure 265.
Figure 263 | Plotting interaction of two categorical covariates
Figure 264 | Plotting interaction of two categorical covariates
Figure 265 | Plotting interaction of two categorical covariates
To create the plot we need two points (Early and Recent) for the Cold studies:
• Column E: Cold and Early. The X values are Hot (0), Recent (0), Interaction (0).
• Column F: Cold and Recent. The X values are Hot (0), Recent (1), Interaction (0).
To create the plot we need two points (Early and Recent) for the Hot studies:
• Column H: Hot and Early. The X values are Hot (1), Recent (0), Interaction (0).
• Column I: Hot and Recent. The X values are Hot (1), Recent (1), Interaction (1).
The predicted value for each column (E, F, H, I) is given by multiplying each X value by the corresponding
coefficient in column C, and then summing across the four rows.
For example, the formula for E10 is =E5*$C$5+E6*$C$6+E7*$C$7+E8*$C$8
To create the plot we need to identify the cells E10 and F10 as endpoints for the Cold studies, and H10
and I10 as endpoints for the Hot studies.
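The same sum-of-products can be sketched in Python. The coefficients below are not copied from Figure 264; they are reconstructions implied by the four plotted Y-values quoted in the series tables that follow, used here purely for illustration:

```python
# Sketch: reproduce the four plotted predicted values from model
# coefficients. The coefficient values are reconstructions implied by
# the plotted endpoints quoted in the text, not values copied from
# Figure 264.
b0, b_hot, b_recent, b_int = -1.1154, 0.8990, -0.3262, 0.2391

def predict(hot, recent):
    # The X values are Hot, Recent, and their product (the interaction)
    return b0 + b_hot * hot + b_recent * recent + b_int * hot * recent

cold_early  = predict(0, 0)   # -1.1154
cold_recent = predict(0, 1)   # -1.4416
hot_early   = predict(1, 0)   # -0.2164
hot_recent  = predict(1, 1)   # -0.3035
```

Each predicted value is exactly what the spreadsheet formula =E5*$C$5+E6*$C$6+E7*$C$7+E8*$C$8 computes for its column.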
Insert > Graph > Line > With markers

Design > Select data > Add Series
  Series name:      H2 (Hot)
  Series X-values:  H3:I3 (Early, Recent)
  Series Y-Values:  H10:I10 (−0.2164, −0.3035)

Design > Select data > Add Series
  Series name:      E2 (Cold)
  Series X-values:  E3:F3 (Early, Recent)
  Series Y-Values:  E10:F10 (−1.1154, −1.4416)
Layout > Axis
  Primary vertical axis > Axis options > Minimum > Fixed > −2.0
  Primary vertical axis > Axis options > Maximum > Fixed > 0.5
  Primary vertical axis > Horizontal Axis Crosses > −2.0
  Primary vertical axis > Number > Number > Decimal places > 2

Layout > Chart Title
  Above chart > Log risk ratio as a function of climate, year, interaction

Format
  Series > Hot > Format selection > Line Style > Dash type > Solid
  Series > Cold > Format selection > Line Style > Dash type > Dashed
Figure 266 | Plotting interaction of two categorical covariates
Plotting the interaction of a categorical covariate by a continuous covariate
The spreadsheet used in this example is [Plot of hot x year-c.xlsx]
In Part 14: Interaction we ran an analysis to assess the interaction of Year-C by Climate.
Since Climate is a categorical variable, the actual analysis covariates are Year-C and Hot, and the
interaction is Year-C by Hot. Following our convention, Hot is coded 0 for Cold studies and 1 for Hot
studies. The interaction is the product of the two. The results of the analysis are shown in Figure 267.
To create the plot we’ll need the covariate names and coefficients from Figure 267. In Excel™, copy
these to columns B and C as shown in Figure 268.
Figure 267 | Plotting interaction of categorical by continuous covariates
Figure 268 | Plotting interaction of categorical by continuous covariates
To create the plot we need two points (1933 and 1968) for the Cold studies:
• Column E: Cold and 1933. The X values are Hot (0), Year-C (−15), Interaction (0).
• Column F: Cold and 1968. The X values are Hot (0), Year-C (20), Interaction (0).
To create the plot we need two points (1933 and 1968) for the Hot studies:
• Column H: Hot and 1933. The X values are Hot (1), Year-C (−15), Interaction (−15).
• Column I: Hot and 1968. The X values are Hot (1), Year-C (20), Interaction (20).
The predicted value for each column (E, F, H, I) is given by multiplying each X value by the corresponding
coefficient in column C, and then summing across the four rows.
For example, the formula for E10 is =E5*$C$5+E6*$C$6+E7*$C$7+E8*$C$8
Note
The columns are labeled 1933 and 1968, since these are the values we want to display in the plot.
Critically, these are not the values we use to compute the effect size. Rather, the X values in the data
columns are the corresponding centered values, −15 and 20.
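The relationship between the plotted endpoints and the regression coefficients can be sketched in Python. Here we recover the coefficients implied by the four Y-values quoted for the series below (an illustration under the stated model, not values copied from Figure 267):

```python
# Recover the coefficients implied by the four plotted endpoints,
# assuming the model  y = b0 + b_hot*Hot + b_year*YearC + b_int*Hot*YearC.
# The Y-values are the ones quoted for the plot series in the text.
y = {
    ("cold", -15): -1.0184,  # Cold, 1933
    ("cold",  20): -1.9286,  # Cold, 1968
    ("hot",  -15): -0.0275,  # Hot, 1933
    ("hot",   20): -0.3929,  # Hot, 1968
}
b_year = (y[("cold", 20)] - y[("cold", -15)]) / 35   # slope of the Cold line
b0 = y[("cold", 20)] - 20 * b_year                   # intercept
hot_slope = (y[("hot", 20)] - y[("hot", -15)]) / 35  # slope of the Hot line
b_int = hot_slope - b_year                           # interaction coefficient
b_hot = y[("hot", 20)] - b0 - 20 * hot_slope         # Hot main effect

def predict(hot, year_c):
    return b0 + b_hot * hot + b_year * year_c + b_int * hot * year_c
```

The interaction coefficient is simply the difference between the two slopes, which is why non-parallel lines in the plot signal an interaction.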
To create the plot we need to identify the cells E10 and F10 as endpoints for the Cold studies, and H10
and I10 as endpoints for the Hot studies.
Insert > Graph > Line > With markers

Design > Select data > Add Series
  Series name:      H2 (Hot)
  Series X-values:  H3:I3 (1933, 1968)
  Series Y-Values:  H10:I10 (−0.0275, −0.3929)

Design > Select data > Add Series
  Series name:      E2 (Cold)
  Series X-values:  E3:F3 (1933, 1968)
  Series Y-Values:  E10:F10 (−1.0184, −1.9286)
Layout > Axis
  Primary vertical axis > Axis options > Minimum > Fixed > −2.5
  Primary vertical axis > Axis options > Maximum > Fixed > 0.5
  Primary vertical axis > Horizontal Axis Crosses > −2.5
  Primary vertical axis > Number > Number > Decimal places > 2

Layout > Axis
  Primary horizontal axis > Axis options > Minimum > Fixed > 1930
  Primary horizontal axis > Axis options > Maximum > Fixed > 1970
  Primary horizontal axis > Vertical Axis Crosses > 1930
  Primary horizontal axis > Number > 0

Layout > Chart Title
  Above chart > Log risk ratio as a function of climate, year, interaction

Format
  Series > Hot > Format selection > Line Style > Dash type > Solid
  Series > Cold > Format selection > Line Style > Dash type > Dashed
Figure 269 | Plotting interaction of categorical by continuous covariates
Plotting the interaction of two continuous covariates
In this example we assess the impact of Year-C, Latitude-C, and Year-C x Latitude-C.
The spreadsheet used in this example is [Plot of latitude-c x year-c.xlsx]
• Year-C is the study year, centered
• Latitude-C is the latitude, centered
• Year-C x Latitude-C is the interaction
The results of the analysis are shown in Figure 270.
To create the plot we’ll need the covariate names and coefficients from Figure 270. In Excel™, copy
these to columns B and C as shown in Figure 271.
Figure 270 | Plotting interaction of continuous covariates
Figure 271 | Plotting interaction of continuous covariates
348
We will plot effect size for three latitudes (13, 33, 55). For each of these latitudes, we need the effect
size at two years (1933, 1968).
To create the plot we need two points (1933 and 1968) for Latitude 13:
• Column E: Latitude=13, 1933. The X values are Latitude-C (−20), Year-C (−15), Interaction (300).
• Column F: Latitude=13, 1968. The X values are Latitude-C (−20), Year-C (20), Interaction (−400).
To create the plot we need two points (1933 and 1968) for Latitude 33:
• Column H: Latitude=33, 1933. The X values are Latitude-C (0), Year-C (−15), Interaction (0).
• Column I: Latitude=33, 1968. The X values are Latitude-C (0), Year-C (20), Interaction (0).
To create the plot we need two points (1933 and 1968) for Latitude 55:
• Column K: Latitude=55, 1933. The X values are Latitude-C (21), Year-C (−15), Interaction (−315).
• Column L: Latitude=55, 1968. The X values are Latitude-C (21), Year-C (20), Interaction (420).
The predicted value for each column (E, F, H, I, K, L) is given by multiplying each X value by the
corresponding coefficient in column C, and then summing across the four rows.
For example, the formula for E10 is =E5*$C$5+E6*$C$6+E7*$C$7+E8*$C$8
Note
The columns are labeled with the years 1933 and 1968, since these are the values we want to display in
the plot. Critically, these are not the values we use to compute the effect size. Rather, the X values in
the data columns are the corresponding centered values, −15 and 20.
Similarly, the columns are labeled with the latitudes 13, 33, and 55, since these are the values we want
to display in the plot. Critically, these are not the values we use to compute the effect size. Rather, the
X values in the data columns are the corresponding centered values, −20, 0, and 21.
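The distinction between the display labels and the centered X values can be made concrete in Python (a sketch; the mapping of labels to centered values is the one given in the bullets above):

```python
# Build the X values for the six plotted points. The column labels use
# the raw latitude (13, 33, 55) and year (1933, 1968), but the values
# used to compute the predictions are the centered covariates and
# their product (the interaction term).
centered_latitude = {13: -20, 33: 0, 55: 21}   # label -> Latitude-C
centered_year = {1933: -15, 1968: 20}          # label -> Year-C

x_values = {
    (lat, yr): (lat_c, yr_c, lat_c * yr_c)     # interaction = product
    for lat, lat_c in centered_latitude.items()
    for yr, yr_c in centered_year.items()
}
print(x_values[(13, 1933)])   # (-20, -15, 300)
print(x_values[(55, 1968)])   # (21, 20, 420)
```

Note that the interaction value is always the product of the two centered values, never of the raw latitude and year.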
To create the plot we need to identify the cells E10 and F10 as endpoints for the Latitude=13 studies,
H10 and I10 as endpoints for the Latitude=33 studies, and K10 and L10 as endpoints for the Latitude=55
studies.
Insert > Scatter > Smooth lines

Design > Select data > Add Series
  Series name:      E2 (Lat=13)
  Series X-values:  E3:F3 (1933, 1968)
  Series Y-Values:  E10:F10 (0.1318, −0.2796)

Design > Select data > Add Series
  Series name:      H2 (Lat=33)
  Series X-values:  H3:I3 (1933, 1968)
  Series Y-Values:  H10:I10 (−0.6171, −0.7835)

Design > Select data > Add Series
  Series name:      K2 (Lat=55)
  Series X-values:  K3:L3 (1933, 1968)
  Series Y-Values:  K10:L10 (−1.4035, −1.3126)
Layout > Axis
  Primary vertical axis > Axis options > Minimum > Fixed > −2.5
  Primary vertical axis > Axis options > Maximum > Fixed > 0.5
  Primary vertical axis > Horizontal Axis Crosses > −2.5
  Primary vertical axis > Number > Number > Decimal places > 2

Layout > Axis
  Primary horizontal axis > Axis options > Minimum > Fixed > 1930
  Primary horizontal axis > Axis options > Maximum > Fixed > 1970
  Primary horizontal axis > Vertical Axis Crosses > 1930
  Primary horizontal axis > Number > 0
  Primary horizontal axis > Number > Use 1000 separator > No

Layout > Chart Title
  Above chart > Log risk ratio as a function of latitude, year, interaction

Format
  Series > Lat=13 > Format selection > Line Style > Dash type > Solid
  Series > Lat=33 > Format selection > Line Style > Dash type > Dashed
  Series > Lat=55 > Format selection > Line Style > Dash type > Dashed
Figure 272 | Plotting interaction of continuous covariates
APPENDIX 8: INTERPRETING REGRESSION COEFFICIENTS
Here, we are working with studies where the treatment is associated with lower risk. Given the coding,
most risk ratios are less than 1.0 and (equivalently) most log risk ratios are negative. Therefore, a
negative coefficient implies that a higher score is associated with a lower risk (the log risk ratio becomes
more negative, and the risk ratio moves further from 1.0).
By contrast, if we were working with studies where the risk ratio tended to be greater than 1 (for example,
the “risk” of living longer) then the log risk ratio would tend to be positive. Then, 0 is no effect, +1 is a large
effect, and +2 is a very large effect. Therefore, a positive coefficient means that as the covariate gets
larger the vaccine is more effective.
The coefficient for Year is 0.0235, which means that for every increase of one year the log risk ratio will
increase by 0.0235 (the vaccine became less effective over time). The corresponding p-value is 0.1390.
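To make the magnitude concrete, the coefficient can be translated back to the risk-ratio scale (a sketch using the Year coefficient quoted above):

```python
import math

# The Year coefficient quoted in the text: each additional year adds
# 0.0235 to the log risk ratio.
b_year = 0.0235
delta_log_rr = 10 * b_year        # change in log risk ratio over a decade
ratio = math.exp(delta_log_rr)    # multiplicative change in the risk ratio

print(round(delta_log_rr, 3))     # 0.235
print(round(ratio, 2))            # 1.26: over ten years the risk ratio is
                                  # multiplied by about 1.26
```

Because effects are modeled on the log scale, a constant additive change per year corresponds to a constant multiplicative change in the risk ratio.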
APPENDIX 9: META-REGRESSION IN STATA
In this section we show the correspondence between results produced by CMA and those produced by
the Stata macro metareg.
Figure 273 | CMA | Intercept + Year + Latitude + Allocation | Z | Method of moments
Figure 274 | Metareg| Intercept + Year + Latitude + Allocation | Z | Method of moments
Figure 275 | CMA | Allocation | Z | Method of moments
Figure 276 | Metareg | Allocation | Z | Method of moments
Figure 277 | CMA | Allocation, Year | Z | Method of moments
Figure 278 | Metareg | Allocation, Year | Z | Method of moments
Figure 279 | CMA | Intercept, Year-C, Year-C2 | Z | Method of moments
Figure 280 | Metareg | Intercept, Year-C, Year-C2 | Z | Method of moments
REFERENCES
Berkey, C.S., Hoaglin, D.C. Mosteller, F., and Colditz, G.A. (1995) A random-effects regression model for
meta-analysis. Statistics in Medicine, 14, 395-411.
Borenstein, M., Hedges, L.V., Higgins, J.P.T., Rothstein, H. (2009) Introduction to Meta-Analysis.
Chichester: Wiley.
Cohen J., Cohen P., West S.G., Aiken, L.S. (2003) Applied Multiple Regression/Correlation Analysis for the
Behavioral Sciences 3rd Edition. Mahwah: Lawrence Erlbaum Associates
Colditz, G.A., Brewer, T.F., Berkey, C.S., Wilson, M.E., Burdick, E., Fineberg, H.V., and Mosteller, F.
(1994). Efficacy of BCG vaccine in the prevention of tuberculosis. Journal of the American Medical
Association, 271, 698-702.
Egger, M., Smith, G.W., Altman, D.G. (2001) Systematic Reviews in Health Care: Meta-Analysis in Context
(2nd Edition). London: BMJ Books.
Hartung, J., Knapp, G., Sinha, B.K., (2008) Statistical Meta-Analysis with Applications. Hoboken: Wiley.
Hedges, L.V., and Olkin I. (1985) Statistical Methods for Meta-Analysis. Boston: Academic Press.
Hedges, L. and Pigott, T.D. (2001) The power of statistical tests in meta-analysis. Psychological Methods
6, 203-217.
Hedges, L. and Pigott, T.D. (2004) The power of statistical tests for moderators in meta-analysis.
Psychological Methods 9, 426-445.
Higgins, J.P.T., and Thompson, S.G. (2004) Controlling the risk of spurious findings from meta-regression.
Statistics in Medicine, 23, 1663-1682.
Rothman, K.J. (1990). No Adjustments are Needed for Multiple Comparisons, Epidemiology, 1, 43-46.
Sutton, A.J., Abrams, K.R., Jones, D.R., Song, F. (2000) Methods for Meta-Analysis in Medical Research.
Chichester: John Wiley and Sons
```