Meta-analysis of diagnostic accuracy studies

Dr Mariska Leeflang
Dept. Clinical Epidemiology, Biostatistics and Bioinformatics
Academic Medical Center, University of Amsterdam
Room J1B – 210
PO Box 227700
1100 DE Amsterdam
m.m.leeflang@amc.uva.nl
Meta-analysis of diagnostic accuracy studies
Mariska Leeflang
(with thanks to Yemisi Takwoingi, Jon Deeks and Hans Reitsma)
1
Diagnostic Test Accuracy Reviews
1. Framing the question
2. Identification and selection of studies
3. Quality assessment
4. Data extraction
5. Data analysis
6. Interpretation of the results
2
Ultimate goal of meta-analysis
Robust conclusions with respect to the research question(s)
3
Meta-Analysis
1. Calculation of an overall summary (average) of high precision, coherent with all observed data
2. Typically a "weighted average" is used, where more informative (larger) studies have more say
3. Assess the degree to which the study results deviate from the overall summary
4. Investigate possible explanations for the deviations
4
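As a reminder (added here, not on the original slide), the generic inverse-variance weighted average behind point 2 can be written as

$$\hat{\theta} = \frac{\sum_i w_i \hat{\theta}_i}{\sum_i w_i}, \qquad w_i = \frac{1}{\operatorname{Var}(\hat{\theta}_i)},$$

so that larger, more precise studies carry more weight in the summary.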
The (meta-)analytic process
1. What analyses did you plan?
   a. Primary objective
   b. Subgroups, sensitivity analyses, etc.
2. What are the data at hand?
   a. Forest plots
   b. Raw ROC plots
   c. Variation in predefined covariates?
3. Is meta-analysis appropriate?
   a. Sufficient clinical/methodological homogeneity
   b. Enough studies per review question
4. Meta-analysis
5
Summary of which values?

                 Disease (ref. test)
Index test       Present      Absent
+                TP           FP
-                FN           TN

Sensitivity
Specificity
Positive Predictive Value
Negative Predictive Value
Positive Likelihood Ratio
Negative Likelihood Ratio
Diagnostic Odds Ratio
ROC curves
6
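In terms of the 2x2 table above, these measures are defined as follows (standard definitions, added here for reference):

$$\text{Sensitivity} = \frac{TP}{TP+FN}, \quad \text{Specificity} = \frac{TN}{TN+FP}, \quad \text{PPV} = \frac{TP}{TP+FP}, \quad \text{NPV} = \frac{TN}{TN+FN},$$
$$LR+ = \frac{\text{sensitivity}}{1-\text{specificity}}, \quad LR- = \frac{1-\text{sensitivity}}{\text{specificity}}, \quad \text{DOR} = \frac{TP \times TN}{FP \times FN} = \frac{LR+}{LR-}.$$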
Pooling sensitivity and specificity?
7
Pooling sensitivity and specificity?
8
Pooling Likelihood Ratios?
9
Pooling LRs?
10
Pooling odds ratios?
11
Let's focus on sensitivity and specificity
 Predictive values depend directly on prevalence
 Pooling likelihood ratios may lead to misleading / impossible results
 Pooling odds ratios may be okay, but they are difficult to interpret
 From the pooled sensitivity and specificity, it is still possible to calculate LRs and PVs
12
Descriptive Analysis
 Forest plots
 point estimate with 95% CI
 paired: sensitivity and specificity side by side
13
14
Descriptive Analysis
 Forest plots
 point estimate with 95% CI
 paired: sensitivity and specificity side by side
 ROC plot
 pairs of sensitivity & specificity in ROC space
 bubble plot to show differences in precision
15
Plot in ROC Space
[ROC plot: true positive rate against false positive rate]
16
Different Approaches
 Pooling separate estimates
   Not recommended
 Summary ROC model
   Traditional approach, relatively simple
 More complex models
   Bivariate random-effects approach
   Hierarchical summary ROC approach
17
Threshold effects
Decreasing the threshold increases sensitivity but decreases specificity; increasing the threshold increases specificity but decreases sensitivity.
[ROC-style plot: fetal fibronectin for predicting spontaneous birth; sensitivity against specificity]
18
Implicit and explicit threshold effects
 Explicit threshold: different thresholds are used for test positivity
 Implicit threshold: there is no threshold, or only one, but in some cases tests are regarded as positive earlier than in other cases
19
Explicit threshold: (ROC) curve
The ROC curve represents the relationship between the true positive rate (TPR) and the false positive rate (FPR) of the test at various thresholds used to distinguish disease cases from non-cases.
Deeks JJ. BMJ 2001;323:157-162
20
Implicit threshold
ELISA for invasive aspergillosis; cut-off value 1.5 ODI.
21
Diagnostic odds ratios
Ratio of the odds of positivity in the diseased to the odds of positivity in the non-diseased:

$$\text{DOR} = \frac{TP \times TN}{FP \times FN} = \frac{\text{sensitivity}/(1-\text{sensitivity})}{(1-\text{specificity})/\text{specificity}} = \frac{LR+}{LR-}$$
22
Diagnostic odds ratios

                 Cervical Cancer (Biopsy)
HPV Test         Present      Absent      Total
+                65           93          158
-                7            161         168
Total            72           254         326

$$\text{DOR} = \frac{65 \times 161}{93 \times 7} \approx 16$$
23
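For illustration (added; not on the original slide), the same 2x2 table also gives the measures that the DOR combines:

$$\text{Sensitivity} = 65/72 \approx 0.903, \qquad \text{Specificity} = 161/254 \approx 0.634,$$
$$LR+ = \frac{0.903}{1-0.634} \approx 2.47, \qquad LR- = \frac{1-0.903}{0.634} \approx 0.153, \qquad \text{DOR} = \frac{LR+}{LR-} \approx 16.$$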
Diagnostic odds ratios

DOR for combinations of sensitivity (rows) and specificity (columns):

Sens \ Spec   50%    60%    70%    80%    90%    95%    99%
50%             1      2      2      4      9     19     99
60%             2      2      4      6     14     29    149
70%             2      4      5      9     21     44    231
80%             4      6      9     16     36     76    396
90%             9     14     21     36     81    171    891
95%            19     29     44     76    171    361   1881
99%            99    149    231    396    891   1881   9801
24
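As a quick check (added for illustration), a test with 90% sensitivity and 90% specificity has

$$\text{DOR} = \frac{0.90/0.10}{0.10/0.90} = 9 \times 9 = 81,$$

matching the corresponding cell of the table.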
Symmetrical ROC curves and diagnostic odds ratios
As the DOR increases, the ROC curve moves closer to its ideal position near the upper-left corner. The ROC curve is asymmetric when test accuracy varies with threshold.
[ROC plot: symmetric SROC curves for DOR = 3, 6, 15 and 90, with the uninformative test (diagonal) and the line of symmetry; sensitivity against specificity]
25
Statistical modelling of ROC curves
 statisticians like straight lines with axes that are independent variables
 first calculate the logits of TPR and FPR
 and then graph the difference against their sum
26
Translating ROC space to D versus S
[Plots: study results in ROC space (true positive rate against false positive rate) translated to D = log odds ratio plotted against S]
27
Moses-Littenberg SROC method
What do the axes mean?
 Difference in logits is the log of the DOR
 Sum of the logits is a marker of diagnostic threshold
[Plot: D = log odds ratio against S]
28
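In symbols (added for clarity; this is simply the definition of the two axes):

$$D = \operatorname{logit}(TPR) - \operatorname{logit}(FPR) = \ln(\text{DOR}), \qquad S = \operatorname{logit}(TPR) + \operatorname{logit}(FPR).$$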
Moses-Littenberg SROC method
 Regression models can be used to fit a straight line describing the relationship between test accuracy and test threshold:
D = a + bS
 Outcome variable D is the difference in the logits
 Explanatory variable S is the sum of the logits
 Ordinary or weighted regression – weighted by sample size or by the inverse variance of the log of the DOR
29
Linear Regression
[Scatter plot of D against S with the fitted regression line]
30
Producing summary ROC curves
 Transform back to the ROC dimensions: solving D = a + bS for the logits gives
$$\operatorname{logit}(TPR) = \frac{a}{1-b} + \frac{1+b}{1-b}\,\operatorname{logit}(FPR)$$
where 'a' is the intercept and 'b' is the slope
 when the ROC curve is symmetrical, b = 0 and the equation simplifies to logit(TPR) = a + logit(FPR)
31
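A minimal SAS sketch of the Moses-Littenberg approach (added for illustration; the data set name dta and the variable names tp, fp, fn and tn are assumptions, not from the slides):

* Moses-Littenberg SROC: compute D and S per study, then fit D = a + b*S;
data dands;
  set dta;                                       /* one record per study: tp fp fn tn */
  tpr = (tp + 0.5) / (tp + fn + 1);              /* continuity-corrected true positive rate */
  fpr = (fp + 0.5) / (fp + tn + 1);              /* continuity-corrected false positive rate */
  d   = log(tpr/(1-tpr)) - log(fpr/(1-fpr));     /* difference of logits = ln(DOR) */
  s   = log(tpr/(1-tpr)) + log(fpr/(1-fpr));     /* sum of logits = threshold marker */
  wgt = 1 / (1/(tp+0.5) + 1/(fp+0.5) + 1/(fn+0.5) + 1/(tn+0.5));  /* inverse variance of ln(DOR) */
run;

proc reg data=dands;
  model d = s;   /* intercept a and slope b */
  weight wgt;    /* omit this statement for ordinary (unweighted) regression */
run;

The fitted a and b can then be back-transformed with the formula above to draw the SROC curve; this sketch does not address the estimation problems of the method listed later.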
Linear Regression & Back Transformation
[Plots: fitted line in D versus S space and the corresponding SROC curve in ROC space (true positive rate against false positive rate) after back-transformation]
32
Different situations
 What is the relationship between the underlying distribution, the ROC curve and the D versus S line?
 Let's have a look at different situations.
33
ROC curve and logit difference and sum plot: small difference, same spread
[Plots: distributions of the measurement in non-diseased and diseased groups, the resulting ROC curve (true positive rate against false positive rate), and logit TPR - logit FPR plotted against logit TPR + logit FPR]
34
ROC curve and logit difference and sum plot: moderate difference, same spread
[Plots as above for this scenario]
35
ROC curve and logit difference and sum plot: large difference, same spread
[Plots as above for this scenario]
36
ROC curve and logit difference and sum plot: moderate difference, unequal spread
[Plots as above; the D versus S line now runs from a HIGH DOR region to a LOW DOR region]
37
SROC regression: another example
[Plots: study results in ROC space (sensitivity against 1-specificity) with unweighted and weighted SROC fits, and the corresponding unweighted and weighted regression lines in D versus S space]
Transformation linearizes the relationship between accuracy and threshold so that linear regression can be used.
38
PSV example cont.
[Plots: unweighted and weighted regression lines in D versus S space and, after the inverse transformation, the corresponding SROC curves in ROC space (sensitivity against 1-specificity)]
The SROC curve is produced by using the estimates of a and b to compute the expected sensitivity (TPR) across a range of values for 1-specificity (FPR).
39
Problems with the Moses-Littenberg SROC method
 Poor estimation
   Tends to underestimate test accuracy due to zero-cell corrections and bias in weights
 Validity of significance tests
   Sampling variability in individual studies not properly taken into account
   P-values and confidence intervals erroneous
 Operating points
   Knowing average sensitivity/specificity is important but cannot be obtained
   Sensitivity for a given specificity can be estimated
40
Advanced models – HSROC and Bivariate methods
 Hierarchical / multi-level
 Logistic
   correctly models sampling uncertainty in the true positive proportion and the false positive proportion
   no zero-cell adjustments needed
 Random effects
   allows for both within- and between-study variability, and within-study correlations between the diseased and non-diseased groups
   allows for heterogeneity between studies
 Regression models
   used to investigate sources of heterogeneity
41
Parameterizations
 HSROC
   Mean lnDOR
   Variance of lnDOR
   Mean threshold
   Variance of threshold
   Shape of the ROC curve
 Bivariate
   Mean logit sensitivity
   Variance of logit sensitivity
   Mean logit specificity
   Variance of logit specificity
   Correlation between sensitivity and specificity

Other than the parameterization, the models are mathematically equivalent; see Harbord R, Deeks J, et al. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 2006;1:1-21.
42
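For reference (notation added here, not from the slides), the bivariate parameterization treats the study-specific logit sensitivity and logit specificity as a pair of correlated normal random effects:

$$\begin{pmatrix}\operatorname{logit}(Se_i)\\ \operatorname{logit}(Sp_i)\end{pmatrix} \sim N\!\left(\begin{pmatrix}\mu_{Se}\\ \mu_{Sp}\end{pmatrix},\ \begin{pmatrix}\sigma^2_{Se} & \rho\,\sigma_{Se}\sigma_{Sp}\\ \rho\,\sigma_{Se}\sigma_{Sp} & \sigma^2_{Sp}\end{pmatrix}\right),$$

with the observed numbers of true positives and true negatives in each study following binomial distributions given $Se_i$ and $Sp_i$. The five HSROC parameters listed above can be expressed as functions of these five bivariate parameters (and vice versa), which is the equivalence referred to in the note above.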
Hierarchical SROC model
[SROC plot in sensitivity-specificity space, annotated with the accuracy, threshold and shape parameters]
43
Bivariate model
[Plot of the summary point in sensitivity-specificity space, annotated with the mean sensitivity, mean specificity and their correlation]
44
Outputs from the models

HSROC
 Estimates the underlying SROC curve and the average operating point on the curve (mean DOR and mean threshold)
 Possible to estimate mean sensitivity, specificity and mean likelihood ratios, with standard errors obtained using the delta method
 Confidence and prediction ellipses estimable

Bivariate
 Estimates the average operating point (mean sensitivity and specificity), confidence and prediction ellipses
 Possible to estimate mean likelihood ratios, with standard errors obtained using the delta method
 Underlying SROC curve estimable
45
Fitting the models

HSROC
 Hierarchical model with non-linear regression, random effects and binomial error
 Original code in WinBUGS
 Easy to fit in PROC NLMIXED in SAS

Bivariate
 Hierarchical model with linear regression, random effects and binomial error
 Easy to fit in PROC NLMIXED in SAS; can also be fitted in PROC MIXED
 Also in GLLAMM in STATA, MLwiN
46
Syntax Proc NLMIXED - HSROC

proc nlmixed data=diag;
  parms alpha=4 theta=0 beta=0 s2ua=1 s2ut=1;
  /* dis is the disease indicator; beta is the shape parameter */
  logitp = (theta + ut + (alpha + ua) * dis) * exp(-(beta)*dis);
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  random ua ut ~ normal([0, 0], [s2ua, 0, s2ut]) subject=study;
run;
47
Hierarchical SROC model
[SROC plot in sensitivity-specificity space, annotated with the accuracy, threshold and shape parameters]
48
Syntax Proc NLMIXED - Bivariate

proc nlmixed data=diag;
  parms msens=1 mspec=2 s2usens=0.2 s2uspec=0.6 cov=0;
  /* dis and nondis are 0/1 indicators for the diseased and non-diseased groups */
  logitp = (msens + usens)*dis + (mspec + uspec)*nondis;
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  random usens uspec ~ normal([0, 0], [s2usens, cov, s2uspec]) subject=study;
run;
49
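To also report summary sensitivity and specificity on the probability scale, ESTIMATE statements can be inserted before the run; statement (a sketch using the parameter names above; standard errors are obtained via the delta method):

  estimate 'Summary sensitivity' exp(msens)/(1 + exp(msens));
  estimate 'Summary specificity' exp(mspec)/(1 + exp(mspec));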
Bivariate model
[Plot of the summary point in sensitivity-specificity space, annotated with the mean sensitivity, mean specificity and their correlation]
50
METADAS
 SAS macro developed to automate HSROC/bivariate analysis using PROC NLMIXED
 Can be used together with Review Manager 5 (Cochrane review software):
   Plot summary curve(s)
   Display summary point(s)
   Display 95% confidence and/or prediction regions for summary point(s)
51
Part 2
Dealing with heterogeneity

The meta-analyst's dream!
[SROC plot: sensitivity against 1-specificity]
53
Realistic situation: vast heterogeneity
54
Echocardiography in Coronary Heart Disease
[ROC plot of study results: sensitivity against 1-specificity]
55
GLAL in Gram Negative Sepsis
[ROC plot of study results: sensitivity against 1-specificity]
56
F/T PSA in the Detection of Prostate Cancer
[ROC plot of study results: sensitivity against 1-specificity]
57
Dip-stick Testing for Urinary Tract Infection
[ROC plot of study results: sensitivity against 1-specificity]
58
Sources of Variation
I. Chance variation
II. Differences in threshold
III. Bias
IV. Clinical subgroups
V. Unexplained variation
59
Sources of Variation: Chance
[Two ROC plots (sensitivity against specificity) showing chance variability with sample size = 40 and sample size = 100]
60
Sources of Variation: Threshold
Threshold:
 perfect negative correlation
 no chance variability
[ROC plot: sensitivity against specificity]
61
Sources of Variation: Threshold
Threshold:
 perfect negative correlation
 + chance variability, ss=60
[ROC plot: sensitivity against specificity]
62
Sources of Variation: Bias & Subgroup
Bias & Subgroup:
 sens & spec higher
 ss=60
 no threshold
[ROC plot: sensitivity against specificity]
63
Sources of Variation
I. Chance variation
II. Differences in threshold
III. Bias
IV. Subgroups
V. Unexplained variation
64
Comparison

Feature                  Older model*    Advanced models**
Chance variability       +/-             +
Threshold differences    +               +
Subgroup                 +               +
Unexplained variation    +/-             +

* Moses-Littenberg model
** Hierarchical and bivariate models
65
Exploring heterogeneity
Summarise data per subgroup
 Subgroup analyses
 Meta-regression analysis
Covariates
 Study characteristics (patients, index tests, reference standard, setting, disease stage, etc.)
 Methodological quality items (QUADAS items)
66
Subgroup analysis and meta-regression
 Advanced models can easily incorporate study-level covariates
 Different questions can be addressed:
   differences in summary points of sensitivity or specificity
   differences in overall accuracy
   differences in threshold
   differences in shape of the SROC curve
67
Limitations of meta-regression
 Validity of covariate information
   poor reporting on design features
 Population characteristics
   information missing or only crudely available
 Lack of power
   small number of contrasting studies
68
Subgroup analyses
Subgroup 1: both sens & spec higher
[ROC plot: sensitivity against specificity for the subgroup]
69
Prospective vs. Retrospective studies
[ROC plot: sensitivity against 1-specificity, with studies marked by data collection: prospective or retrospective]
70
This may look easy, but…
 The following slides give the results of a study we did to incorporate the effects of quality into a meta-analysis.
Leeflang et al. Impact of adjustment for quality on results of metaanalyses of diagnostic accuracy. Clin Chem. 2007;53:164-72.
71
Effects of high/low quality?
1. Change in DOR
2. Change in consistency of DOR
3. Change in heterogeneity
72
Hypotheses
Deficiencies in study quality have been associated with inflated estimates and with heterogeneity. Accounting for quality differences will therefore lead to…
 … less optimistic summary estimates.
 … more homogeneous results.
73
Challenge 3
Incorporation Strategies
1. Ignoring (sometimes graphs are shown)
   pooling all studies, disregarding quality
2. Subgroup Analysis
   also: quality as criterion for inclusion
   also: stratification → more than one subgroup
   also: sensitivity analysis
3. Regression analysis
   stepwise multivariable regression analysis and multivariable regression analysis with a fixed set of covariates
4. Weighted pooling
   'not done'
5. Sequential analysis
   highest quality → lowest quality
   cumulative meta-analysis
74
Methods
 Quality assessment in 487 studies included in 30 systematic reviews
 QUADAS checklist used (Whiting et al. BMC Med Res Methodol, 2003)
 Two definitions of high quality:
   1. Evidence-based definition
   2. Common practice definition
 Three methods for incorporation of quality:
   1. Exclusion of low-quality studies
   2. Multivariable regression analysis with all items involved
   3. Stepwise multivariable regression analysis (p>0.2)
 Comparison of DORs, 95% CIs of DORs, and changes in a hypothetical decision
75
Evidence-based definition
76
Common practice definition
77
Results
 Non-reporting of items was common, especially for blinding of index or reference test, the time interval between index test and reference test, and the inclusion of patients.
 Evidence-based definition: 72 high-quality studies (15%); 12 reviews contained no high-quality studies.
 Common-practice definition: 70 high-quality studies (14%); 9 reviews contained no high-quality studies.
 Fulfilling all 8 criteria: only 10 out of 487 studies were of high quality, and only 1 meta-analysis out of 31 contained more than 3 high-quality studies…
78
The Strategies
 Ignoring quality: pooling all studies
 Analyzing subgroups: only pooling high-quality studies; high quality defined as fulfilling a certain subset of criteria
 Stepwise multivariable regression analysis: QUADAS items with a univariate p-value <0.2 are entered in a multivariable regression model
 Multivariable regression analysis with a set of covariates: a standard set of three QUADAS items was used as covariates in each meta-analysis
79
[Plot: DOR by meta-analysis ID]
80
Conclusions?
We found no evidence for our hypothesis that adjusting for quality leads to less optimistic and more homogeneous results.
Explanations:
 Poor reporting
 Small sample size (30 SRs, small studies)
 Opposite effects of quality items
 DOR instead of sensitivity and specificity
 Relation between quality and estimates not straightforward
Still, poor quality will affect trustworthiness. Therefore, report the quality of individual studies and the overall quality.
81
Exercise
 What do the results of a meta-analysis mean…?
 I have some output from SAS and STATA and would like to invite you to have a look at it.
82
Bivariate or HSROC?
What do the parameters mean?
83
84
Part 3
Test Comparisons
Differences between tests
 Diagnosis of lymph node metastasis in women with cervical cancer
 2 imaging modalities:
   lymphangiography (LAG, n=17)
   CT (n=17)
 Published meta-analysis: JAMA 1997;278:1096-1101
 Modelled by adding a covariate for test into the model statement (see the sketch below), with parameter estimates for differences in:
   sensitivity and specificity for the bivariate model
   log DOR, threshold and shape for the HSROC model
86
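A minimal sketch (added; not from the original slides) of how a test covariate can enter the bivariate PROC NLMIXED model shown earlier. It assumes a variable testCT coded 1 for CT studies and 0 for LAG studies, and this simple version uses the same between-study variances for both tests:

proc nlmixed data=diag;
  parms msens=1 mspec=2 dsens=0 dspec=0 s2usens=0.2 s2uspec=0.6 cov=0;
  /* dsens and dspec are the differences (CT minus LAG) in logit sensitivity and logit specificity */
  logitp = (msens + dsens*testCT + usens)*dis + (mspec + dspec*testCT + uspec)*nondis;
  p = exp(logitp)/(1+exp(logitp));
  model pos ~ binomial(n,p);
  random usens uspec ~ normal([0, 0], [s2usens, cov, s2uspec]) subject=study;
run;

Wald tests of dsens and dspec then correspond to the comparisons of summary sensitivity and specificity reported a few slides below.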
ROC plot of individual study results (L = lymphangiography, C = CT)
[ROC plot: sensitivity against 1-specificity, each study marked L or C]
87
Summary ROC estimates
[ROC plot: study results with summary ROC curves for LAG and CT (true positive rate against false positive rate)]
88
Average operating points and confidence ellipses
[ROC plot: study results with the average operating points and confidence ellipses for LAG and CT (sensitivity against 1-specificity)]
89
Difference between average operating points

Imaging modality      Sensitivity (95% CI)    Specificity (95% CI)
LAG                   0.67 (0.57 to 0.76)     0.80 (0.73 to 0.85)
CT                    0.49 (0.37 to 0.61)     0.92 (0.88 to 0.95)
P-value LAG vs. CT    0.023                   0.0002
90
Summary points or SROC curves?
 Clinical interpretation
   Need to estimate performance at a threshold, using sensitivity, specificity and/or likelihood ratios
 Single threshold or mixed thresholds?
   Summary curve describes how test performance varies across thresholds. Studies do not need to report a common threshold to contribute.
   Summary point must relate to a particular threshold. Only studies reporting a common threshold can be combined.
91
Summary points or SROC curves?
 Comparing tests and subgroups
   Often wish to use as much data as possible:
     if this means mixing thresholds, SROC curves are needed
     if there is still a common threshold, either method is appropriate
   Possible to assess the impact of threshold as a covariate
   SROC curves allow identification of crossing lines
   A Cochrane review may include both an analysis of the SROC curves and estimation of average threshold-specific operating points
92
Comparative analyses
 Indirect comparisons
   Different tests used in different studies
   Potentially confounded by other differences between the studies
 Direct comparisons
   Patients receive both tests or are randomized to tests
   Differences in accuracy more attributable to the tests
   Few studies may be available and they may not be representative
93
Example of pilot Cochrane Review
Down's Syndrome screening review

                                         Studies   Participants
1st trimester - NT alone                 10        79,412
1st trimester - NT and serology          22        222,171
2nd trimester - triple test (serology)   19        72,797
94
95
Indirect comparison

NT alone
Sensitivity: 72% (63%-79%)
Specificity: 94% (91%-96%)
DOR: 39 (26-60)

NT with serology
Sensitivity: 86% (82%-90%)
Specificity: 95% (93%-96%)
DOR: 110 (84-143)
RDOR (vs. NT alone): 2.8 (1.7-4.6), p <0.0001

Triple test
Sensitivity: 82% (76%-86%)
Specificity: 83% (77%-87%)
DOR: 21 (15-30)
RDOR (vs. NT alone): 0.5 (0.3-0.9), p = 0.03
96
DIRECT COMPARISONS

NT alone
Sensitivity: 71% (59%-82%)
Specificity: 95% (91%-98%)
DOR: 41 (16-67)

NT with serology
Sensitivity: 85% (77%-93%)
Specificity: 96% (93%-98%)
DOR: 123 (40-206)

Triple test
No paired studies available
97
Indirect versus Direct comparisons

NT alone
                Indirect                    Direct
Sensitivity     72% (63%-79%)               71% (59%-82%)
Specificity     94% (91%-96%)               95% (91%-98%)
DOR             39 (26-60)                  41 (16-67)

NT with serology
                Indirect                    Direct
Sensitivity     86% (82%-90%)               85% (77%-93%)
Specificity     95% (93%-96%)               96% (93%-98%)
DOR             110 (84-143)                123 (40-206)
RDOR            2.8 (1.7-4.6), p <0.0001
98
Part 4
Some other issues
Another approach…
 Hypothesis testing is not common in diagnostic test accuracy research or in diagnostic meta-analyses.
 But you could test whether the studies you found, or the summary estimate, fall within a certain target region.
100
Target region
[Plot: true positive rate (sens) against false positive rate (1-spec), with a target region marked]
101
[Plot: true positive rate (sens) against false positive rate (1-spec), with a target region marked]
102
Publication bias
 In systematic reviews of intervention studies, publication bias is an important form of bias
 To investigate publication bias in reviews, funnel plots are used
 In diagnostic reviews, funnel plots are seriously misleading and alternatives have poor power
103
Publication bias - background
 many studies are done without ethical review or study registration → prospective registration is therefore not available
 diagnostic test accuracy studies do not test hypotheses, so there is no 'significance' involved
 we have no clue whether publication bias exists for diagnostic accuracy studies and how the mechanisms behind it may work
104
Summary
 Part 1: meta-analysis introduction
 Part 2: heterogeneity
 Part 3: test comparisons
 Part 4: some other issues
105