# Between-study Heterogeneity {#heterogeneity}
![](_figs/schiffchen.jpg)
By now, we have already shown you how to pool effect sizes in a meta-analysis. In meta-analytic pooling, we aim to **synthesize the effects of many different studies into one single effect**. However, this only makes sense if we aren't comparing **Apples and Oranges**. For example, it could be the case that while the overall effect we calculate in the meta-analysis is **small**, a few studies still report **very high** effect sizes. Such information is lost in the aggregate effect, but it is very important to know whether all studies, or interventions, yield small effect sizes, or whether there are exceptions.
It could also be the case that some very **extreme effect sizes**, so-called **outliers**, were included in the meta-analysis. Such outliers might even have distorted our overall effect, and it is important to know how our overall effect would have looked without them.
The extent to which effect sizes vary within a meta-analysis is called **heterogeneity**. It is very important to assess heterogeneity in meta-analyses, as high heterogeneity could be caused by the fact that there are actually two or more **subgroups** of studies present in the data, each with a different true effect. Such information could be very valuable for **research**, because it might allow us to identify certain interventions or populations for which effects are lower or higher. From a statistical standpoint, high heterogeneity is also problematic. Very high heterogeneity could mean that the studies have nothing in common, and that there is no **"real" true effect behind our data**, meaning that it makes no sense to report the pooled effect at all [@borenstein2011].
```{block,type='rmdinfo'}
**The idea behind heterogeneity**
Rücker and colleagues [@rucker2008undue] name three types of heterogeneity in meta-analyses:
1. *Clinical baseline heterogeneity*. These are differences between the sample characteristics of the studies. For example, while one study might have included rather old participants, another might have recruited study participants who were mostly quite young.
2. *Statistical heterogeneity*. This is the statistical heterogeneity we find in our collected effect size data. Such heterogeneity might be important either from a clinical standpoint (e.g., when we do not know if a treatment is very or only marginally effective because the effects vary greatly from study to study), or from a statistical standpoint (because it dilutes the confidence we have in our pooled effect).
3. *Other sources of heterogeneity*, such as design-related heterogeneity.
Points 1 and 3 may be controlled for to some extent by restricting the scope of our search for studies to certain well-defined intervention types, populations, and outcomes.
Point 2, on the other hand, can only be assessed once we have pooled the studies. This is what this chapter focuses on.
```
```{block,type='rmdinfo'}
**Heterogeneity Measures**
There are **three types of heterogeneity measures** which are commonly used to assess the degree of heterogeneity. In the following formulas, $k$ indexes an individual study, $K$ is the total number of studies in our meta-analysis, $\hat \theta_k$ is the estimated effect of study $k$ with a variance of $\hat \sigma^{2}_k$, and $w_k$ is the individual **weight** of the study (i.e., its *inverse variance*: $w_k = \frac{1}{\hat \sigma^{2}_k}$; see the infobox in [Chapter 4.1.1](#fixed) for more details).
**1. Cochran's *Q* **
Cochran's *Q*-statistic is the **difference between the observed effect sizes and the fixed-effect model estimate** of the effect size, which is then **squared, weighted and summed**.
$$ Q = \sum\limits_{k=1}^K w_k (\hat\theta_k - \frac{\sum\limits_{k=1}^K w_k \hat\theta_k}{\sum\limits_{k=1}^K w_k})^{2}$$
**2. Higgins' & Thompson's *I*^2^ **
$I^{2}$ [@higgins2002quantifying] is the **percentage of variability** in the effect sizes which is not caused by sampling error. It is derived from $Q$:
$$I^{2} = \max \left\{0, \frac{Q-(K-1)}{Q} \right\}$$
**3. Tau-squared**
$\tau^{2}$ is the between-study variance in our meta-analysis. As we show in [Chapter 4.2.1](#tau2), there are various proposed ways to calculate $\tau^{2}$.
```
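To make these formulas more tangible, here is a minimal sketch (using hypothetical effect sizes and standard errors, not data from this guide) of how $Q$ and $I^{2}$ could be computed by hand:
```{r, eval=FALSE}
theta <- c(0.35, 0.61, 0.12, 0.45)  # hypothetical study effect sizes
se    <- c(0.12, 0.20, 0.15, 0.18)  # hypothetical standard errors
w     <- 1 / se^2                   # inverse-variance weights

theta.fixed <- sum(w * theta) / sum(w)   # fixed-effect model estimate
Q  <- sum(w * (theta - theta.fixed)^2)   # Cochran's Q
K  <- length(theta)
I2 <- max(0, (Q - (K - 1)) / Q)          # Higgins & Thompson's I-squared
Q; I2
```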
```{block, type='rmdachtung'}
**Which measure should I use?**
Generally, when we assess and report heterogeneity in a meta-analysis, we need a measure which is **robust, and not too easily influenced by statistical power**. **Cochran's *Q* ** increases both when the **number of studies** ($k$) increases, and when the **precision** (i.e., the sample size $N$ of a study) increases. Therefore, $Q$, and whether it is **significant**, depends heavily on the size of your meta-analysis, and thus its statistical power. We should therefore not rely on $Q$ alone when assessing heterogeneity.
$I^2$, on the other hand, is not sensitive to changes in the number of studies in the analysis. $I^2$ is therefore used extensively in medical and psychological research, especially since there is a **"rule of thumb"** to interpret it [@higgins2003measuring]:
* *I*^2^ = 25%: **low heterogeneity**
* *I*^2^ = 50%: **moderate heterogeneity**
* *I*^2^ = 75%: **substantial heterogeneity**
Despite its common use in the literature, $I^2$ is not always an adequate measure of heterogeneity either, because it still heavily depends on the **precision** of the included studies [@rucker2008undue; @borenstein2017basics]. As said before, $I^{2}$ is simply the proportion of variability **not caused by sampling error**. As our studies become increasingly large, this sampling error tends to **zero**, while at the same time $I^{2}$ tends to 100%, simply because the single studies have a greater $N$. Relying only on $I^2$ is therefore not a good option either.
**Tau-squared**, on the other hand, is **insensitive** to both the number of studies **and** their precision. Yet, it is often hard to interpret how relevant our tau-squared value is from a practical standpoint.
**Prediction intervals** (like the ones we automatically calculated in [Chapter 4](#pool)) are a good way to overcome this limitation [@inthout2016plea], as they take our between-study variance into account. Prediction intervals give us a range into which we can **expect the effects of future studies to fall** based on **our present evidence in the meta-analysis**. If our prediction interval, for example, lies completely on the "positive" side favoring the intervention, we can be quite confident that **despite varying effects, the intervention is likely to be at least somewhat beneficial in all contexts similar to the ones we studied**. If the prediction interval includes **zero**, we can be less sure about this, although it should be noted that **broad prediction intervals are quite common, especially in medicine and psychology**.
```
<br><br>
---
## Assessing the heterogeneity of your pooled effect size
Thankfully, once you have pooled your effects in a meta-analysis using the `metagen()`, `metabin()`, or `metacont()` function, it is very easy and straightforward to retrieve the **three most common heterogeneity measures** that we described before.
In [Chapter 4.2.2](#random.precalc), we already showed you how to conduct a **random-effects model meta-analysis**. In this example, we stored our **results** in the object `m.hksj`, which we will use again here.
```{r, echo=FALSE, warning=FALSE, message=FALSE}
library(tidyverse)
library(knitr)
library(meta)
library(metafor)
load("_data/Meta_Analysis_Data.RData")
madata <- Meta_Analysis_Data
load("_data/metacont_data.RData")
metacont$Ne <- as.numeric(metacont$Ne)
metacont$Me <- as.numeric(metacont$Me)
metacont$Se <- as.numeric(metacont$Se)
metacont$Mc <- as.numeric(metacont$Mc)
metacont$Sc <- as.numeric(metacont$Sc)
m.hksj <- metagen(TE,
                  seTE,
                  data = madata,
                  studlab = paste(Author),
                  comb.fixed = FALSE,
                  comb.random = TRUE,
                  method.tau = "SJ",
                  hakn = TRUE,
                  prediction = TRUE,
                  sm = "SMD")
m.hksj.raw <- metacont(Ne,
                       Me,
                       Se,
                       Nc,
                       Mc,
                       Sc,
                       data = metacont,
                       studlab = paste(Author),
                       comb.fixed = FALSE,
                       comb.random = TRUE,
                       method.tau = "SJ",
                       hakn = TRUE,
                       prediction = TRUE,
                       sm = "SMD")
```
One way to get the heterogeneity measures of a meta-analysis is to **print** the meta-analysis output (in my case, `m.hksj`) again.
```{r}
print(m.hksj)
```
We see that this output **already provides us with all three heterogeneity measures** (and even one more, *H*, which we will not cover here).
* $\tau^{2}$, as we can see from the `tau^2` output, is **0.1337**.
* $I^{2}$ is printed next to `I^2`, and has the value **62.6%**, and a 95% confidence interval ranging from 37.9% to 77.5%.
* The value of $Q$ is displayed next to `Q` under `Test of heterogeneity:`.
As we can see, the value is **45.50**. In our case, this is highly significant ($p=0.0002$; see `p-value`).
* The **prediction interval** can be found next to `Prediction interval`. As we can see, the 95% interval ranges from **g=-0.21** to **1.40**.
How can we interpret the values of this example analysis? Well, all three of our indicators suggest that **moderate to substantial heterogeneity is present in our data**. Given the **broad prediction interval**, which stretches well below zero, we also cannot be overly confident that the positive effect we found for our interventions is robust in every context. It might be very well possible that the intervention does not yield positive effects in some future scenarios; even a small negative effect might be possible based on the evidence the meta-analysis gives us. Very high effect sizes, on the other hand, are possible too.
<br><br>
**When the measures are not displayed in my output**
Depending on how you changed the settings of the `metagen`, `metabin`, or `metacont` function, it is possible that some of the measures are not displayed in your output. That's not a big deal, because all measures are stored in the object, whether they are immediately displayed or not.
To directly access one of the measures, we can use the `$` operator again (see [Chapter 3.3.1](#convertfactors)). We use this **in combination with our meta-analysis output object** to define which measure we want to see.
```{r,echo=FALSE}
library(kableExtra)
Code<-c("$Q","$pval.Q","$I2","$lower.I2","$upper.I2","$tau^2","$lower.predict","$upper.predict")
Measure<-c("Cochran's Q","The p-value for Cochran's Q","I-squared","The lower bound of the I-squared 95%CI","The upper bound of the I-squared 95%CI","Tau-squared","The lower bound of the 95% prediction interval","The upper bound of the 95% prediction interval")
m<-data.frame(Code,Measure)
names<-c("Code","Measure")
colnames(m)<-names
kable(m)
```
Here are a few examples for my `m.hksj` object. As you'll see, the output is **identical** to the one before.
```{r}
m.hksj$Q
```
```{r}
m.hksj$I2
```
```{r}
m.hksj$tau^2
```
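In the same way, we can retrieve the bounds of the 95% prediction interval listed in the table above:
```{r, eval=FALSE}
m.hksj$lower.predict
m.hksj$upper.predict
```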
<br><br>
---
## Detecting outliers & influential cases
As mentioned before, **between-study heterogeneity** can also be caused by one or more studies with **extreme effect sizes** which do not quite **fit in**. Especially when the **quality of these studies is low**, or the **studies are very small**, this may **distort** our pooled effect estimate, and it's a good idea to have a look at the **pooled effect again once we remove such outliers from the analysis**.
On the other hand, we also want to know **if the pooled effect estimate we found is robust**, meaning that the effect does not depend heavily on **one single study**. Therefore, we also want to know **whether there are studies which heavily push the effect of our analysis into one direction**. Such studies are called **influential cases**, and we'll devote some time to this topic in the [second part](#influenceanalyses) of this chapter.
```{block,type='rmdachtung'}
It should be noted that there are **many methods** to spot **outliers and influential cases**, and the methods described here are not comprehensive. If you want to read more about the underpinnings of this topic, we can recommend the paper by Wolfgang Viechtbauer and Mike Cheung [@viechtbauer2010outlier].
```
### Searching for extreme effect sizes (outliers) {#outliers}
A common method to detect outliers directly is to define a study as an outlier if the **study's confidence interval does not overlap with the confidence interval of the pooled effect**. This means that we define a study as an outlier when its effect size estimate is **so extreme that we have high certainty that the study cannot be part of the "population" of effect sizes we actually pool in our meta-analysis** (i.e., the individual study differs significantly from the overall effect). To detect such outliers in our dataset, we can search for all studies:
* for which the **upper bound of the 95% confidence interval is lower than the lower bound of the pooled effect confidence interval** (i.e., extremely small effects)
* for which the **lower bound of the 95% confidence interval is higher than the upper bound of the pooled effect confidence interval** (i.e., extremely large effects)
Here, I will use my `m.hksj` meta-analysis output from [Chapter 4.2.2](#random.precalc) again. Let us see what the **upper and lower bounds of my pooled effect confidence interval** are. As I performed a **random-effects meta-analysis in this example**, I will use the values stored under `$lower.random` and `$upper.random`. If you performed a **fixed-effect meta-analysis**, the objects would be `$lower.fixed` and `$upper.fixed`, respectively.
```{r}
m.hksj$lower.random
m.hksj$upper.random
```
Here we go. I now see that my **pooled effect confidence interval** stretches from $g = 0.389$ to $g = 0.798$. We can use these values to filter out outliers now.
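Before we turn to the convenience function below, here is a minimal sketch of how such a filter could be written by hand (assuming, as is the case for `meta` objects, that the study-level confidence interval bounds are stored under `$lower` and `$upper`):
```{r, eval=FALSE}
# Study labels and study-level confidence interval bounds
out.df <- data.frame(Author = m.hksj$studlab,
                     lower  = m.hksj$lower,
                     upper  = m.hksj$upper)

# Keep only studies whose CI lies completely outside the pooled effect CI
subset(out.df,
       upper < m.hksj$lower.random |  # extremely small effects
       lower > m.hksj$upper.random)   # extremely large effects
```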
To do this, we have prepared a **function** called `find.outliers` for you. The function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r, message = FALSE, warning = FALSE}
library(dmetar)
```
If you do not want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/find.outliers.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hit **Enter ⏎**. The function then requires the `meta` and `metafor` package to work.
The only thing we have to provide the `find.outliers` function with is the **meta-analysis object** that we want to check for outliers. In my case, this is `m.hksj`.
```{r,eval=FALSE, message=FALSE, warning=FALSE}
find.outliers(m.hksj)
```
This is the output we get from the function:
```{r,echo=FALSE}
find.outliers(m.hksj)
```
We see that the function has detected **two outliers**, "DanitzOrsillo" and "Shapiro". Conveniently, the `find.outliers` function has also automatically rerun our initial analysis, this time excluding the identified outliers. From the output, we see that the heterogeneity shrinks considerably when the two studies are excluded, from $I^2 = 62\%$ to $24.8\%$, and the test of heterogeneity is no longer significant ($p = 0.1739$).
We can also produce an updated forest plot in which the outliers are excluded by plugging the results of the `find.outliers` function into the `forest` function (this only works if you have already loaded the `meta` and `metafor` package from your library). The appearance of the resulting forest plot can be changed using arguments of the `forest` function in `meta` (these arguments are covered in detail in [Chapter 5](#forest)).
```{r, eval=F}
fo <- find.outliers(m.hksj)
forest(fo, col.predict = "blue")
```
```{r, fig.width=9, fig.height=6, echo=F, message=F}
library(meta)
library(metafor)
fo <- find.outliers(m.hksj)
dmetar:::forest.find.outliers(fo, col.predict = "blue")
```
In the resulting forest plot, the outlying studies are still displayed. However, their **weight** in the meta-analysis has been set to 0\% (as shown in the column to the right), meaning that they are excluded from pooling.
<br></br>
---
## Influence Analyses {#influenceanalyses}
We have now shown you how you can detect and remove **extreme effect sizes** (outliers) in your meta-analysis. As we have mentioned before, however, it is not only statistical outliers which may cause concerns regarding the robustness of our pooled effect. It is also possible that **some studies in a meta-analysis exert a very high influence on our overall results**. For example, it could be the case that we find that an overall effect is not significant, when in fact a highly significant effect is consistently found once we remove one particular study from our analysis. Such information is **highly important when we want to communicate the results of our meta-analysis to the public**.
Here, we present techniques which dig a little deeper than simple outlier removal. These techniques are based on the **Leave-One-Out** method, in which we **recalculate the results of our meta-analysis** $K$ times, each time leaving out one study. This way, we can more easily detect **the studies which influence the overall estimate of our meta-analysis the most**, and this lets us better assess whether this **influence may distort our pooled effect** [@viechtbauer2010outlier]. Thus, such analyses are called **Influence Analyses**. We have created the **function** `InfluenceAnalysis` for you, in which these analyses are conducted and their results are visualized all in one. This function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r, eval=FALSE}
library(dmetar)
```
If you do not want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/influence.analysis.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hit **Enter ⏎**. The function requires the `ggplot2`, `ggrepel`, `forcats`, `dplyr`, `grid`, `gridExtra`, `metafor`, and `meta` packages to work.
The `InfluenceAnalysis` function has **several parameters** which we have to define.
```{r,echo=FALSE}
Code<-c("x","random","subplot.heights", "subplot.widths", "forest.lims", "return.separate.plots", "text.scale")
Description<-c("An object of class meta, generated by the metabin, metagen, metacont, metacor, metainc, or metaprop function",
"Logical. Should the random-effects model be used to generate the influence diagnostics? Uses the method.tau specified in the meta object if one of 'DL', 'HE', 'SJ', 'ML', 'REML', 'EB', 'HS' or 'GENQ' (to ensure compatibility with the metafor package). Otherwise, the DerSimonian-Laird ('DL'; DerSimonian & Laird, 1986) estimator is used. FALSE by default.",
"Concatenated array of two numerics. Specifies the heights of the first (first number) and second (second number) row of the overall results plot generated by the function. Default is c(30,18).",
"Concatenated array of two numerics. Specifies the widths of the first (first number) and second (second number) column of the overall results plot generated by the function. Default is c(30,30).",
"Concatenated array of two numerics. Specifies the x-axis limits of the forest plots generated by the function. Use 'default' if standard settings should be used (this is the default).",
"Logical. Should the influence plots be returned as seperate plots in lieue of returning them in one overall plot? Additionally returns a dataframe containing the data used for plotting. If set to TRUE, the ouput of the function must be saved in a variable; specific plots can then be accessed by selecting the plot element and using plot().",
"Positive numeric. Scaling factor for the text geoms used in the plot. Values <1 shrink the text, while values >1 increase the text size. Default is 1." )
m<-data.frame(Code, Description)
names<-c("Code", "Description")
colnames(m)<-names
kable(m)
```
This is how the function code looks for my `m.hksj` data. I will save the output of the function as `inf.analysis`.
```{r, echo=FALSE, message=FALSE, error=FALSE}
library(ggplot2)
library(ggrepel)
library(forcats)
library(dplyr)
library(grid)
library(gridExtra)
library(metafor)
library(meta)
source("dmetar/influence.analysis.R")
```
```{r}
inf.analysis <- InfluenceAnalysis(x = m.hksj,
                                  random = TRUE)
```
Now, let us have a look at the output, using the `summary` function.
```{r}
summary(inf.analysis)
```
We can also conveniently generate the different influence plots by using `plot`, and specifying the plot we want to generate in the second argument. Let us interpret them one by one.
<br><br>
**Influence Analyses**
```{r, eval=FALSE}
plot(inf.analysis, "influence")
```
```{r,echo=FALSE,fig.align='center',fig.cap="Influence Analyses", fig.height=7}
library(png)
library(grid)
img <- readPNG("_figs/inflA1.png")
grid.raster(img)
```
In the first analysis, you can see different influence measures, with **graphs showing the value of each individual study in our meta-analysis**. This type of **influence analysis** has been proposed by Viechtbauer and Cheung [@viechtbauer2010outlier]. Let us discuss the most important subplots here:
* **dffits**: The DFFITS value of a study indicates in standard deviations how much the predicted pooled effect changes after excluding this study.
* **cook.d**: The **Cook's distance** resembles the **Mahalanobis distance** you may know from outlier detection in conventional multivariate statistics. It quantifies the distance between the pooled effect with the study included and the pooled effect with the study excluded.
* **cov.r**: The **covariance ratio** is the determinant of the variance-covariance matrix of the parameter estimates when the study is removed, divided by the determinant of the variance-covariance matrix of the parameter estimates when the full dataset is considered. Importantly, values of cov.r < 1 indicate that removing the study will lead to a more precise effect size estimation (i.e., less heterogeneity).
Usually, however, you don't have to dig this deep into the calculations of the individual measures. As a rule of thumb, **influential cases** are studies with **very extreme values in the graphs**. Viechtbauer and Cheung have also proposed cut-offs for defining a study as an influential case, for example (with $p$ being the number of model coefficients and $k$ the number of studies):
$$ DFFITS > 3\times\sqrt{\frac{p}{k-p}}$$
$$ hat > 3\times\frac{p}{k}$$
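For illustration, here is a small sketch computing these cut-offs in R for our example (with $p = 1$, since we fit an intercept-only random-effects model, and $k$ taken from the meta-analysis object):
```{r, eval=FALSE}
p <- 1          # number of model coefficients (intercept-only model)
k <- m.hksj$k   # number of studies stored in the meta object
3 * sqrt(p / (k - p))   # DFFITS cut-off
3 * p / k               # hat value cut-off
```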
If a study is classified as **an influential case using these cut-offs**, its value will be displayed in **red** (in our example, this is the case for study "Dan", or Danitz-Orsillo).
```{block, type='rmdachtung'}
Please note, as Viechtbauer & Cheung emphasize, that **these cut-offs are based on somewhat arbitrary thresholds**. Therefore, you should never look only at the color of a study, but at the general structure of the graph, and interpret the results in context.
In our example, we see that while only the study by Danitz-Orsillo is flagged as an influential case, there are **actually two spikes in most plots**, while the other studies have roughly the same values. Given this structure, we could also decide to define **study "Sha"** (Shapiro et al.) as an influential case too, because its values are also very extreme.
```
In these analyses, we found that the studies "Danitz-Orsillo" and "Shapiro et al." might be influential. This is an interesting finding, as we **detected the same studies when only looking at statistical outliers**. This further corroborates that these two studies could have distorted our pooled effect estimate, and **might cause parts of the between-study heterogeneity we found in our meta-analysis**.
<br><br>
**Baujat Plot**
```{r, eval=FALSE}
plot(inf.analysis, "baujat")
```
```{r,echo=FALSE, fig.align='center',fig.cap="Baujat Plot",fig.height=6}
library(png)
library(grid)
img <- readPNG("_figs/inflA2.png")
grid.raster(img)
```
The Baujat Plot [@baujat2002graphical] is a diagnostic plot to detect studies **overly contributing to the heterogeneity of a meta-analysis**. The plot shows the contribution of each study to the overall heterogeneity as measured by Cochran's $Q$ on the **horizontal axis**, and its **influence on the pooled effect size** on the vertical axis. As we want to assess heterogeneity and studies contributing to it, all studies **on the right side of the plot are important to look at**, as this means that they cause much of the heterogeneity we observe. **This is even more important when a study contributes much to the overall heterogeneity, while at the same time being not very influential concerning the overall pooled effect** (e.g., because the study had a very small sample size). Therefore, **all studies on the right side of the Baujat plot**, especially in the **lower part**, are important for us.
As you might have already recognized, the only two **studies we find in this region of the plot are the two studies we already detected before** (Danitz & Orsillo, Shapiro et al.). These studies do not have a large impact on the overall results (presumably because they are very small), but they do **add substantially to the heterogeneity we found in the meta-analysis**.
<br><br>
**Leave-One-Out Analyses**
```{r}
plot(inf.analysis, "es")
```
```{r}
plot(inf.analysis, "i2")
```
In these two forest plots, we see the **pooled effect recalculated, with one study omitted each time**. The two plots provide the same data, but are ordered by different values.
The **first plot is ordered by heterogeneity (low to high), as measured by $I^2$**. We see in the plot that the lowest $I^2$ heterogeneity is reached (as we've seen before) by omitting the studies **Danitz & Orsillo** and **Shapiro et al.**. This again corroborates our finding that these two studies were the main "culprits" for the between-study heterogeneity we found in the meta-analysis.
The **second plot is ordered by effect size (low to high)**. Here, we see how the overall effect estimate changes with one study removed. Again, as the two outlying studies have very high effect sizes, we find that the overall effect is smallest when they are removed.
All in all, the results of our **outlier and influence analysis** in this example point in the **same direction**. The two studies are probably **outliers which may distort the effect size estimate**, as well as its **precision**. We should therefore also conduct and report a **sensitivity analysis in which these studies are excluded**.
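As a side note, a basic leave-one-out table like the one underlying these plots can also be produced directly with the `metainf` function included in the `meta` package (a minimal alternative to the richer diagnostics of `InfluenceAnalysis`):
```{r, eval=FALSE}
# Recalculate the pooled effect, omitting one study at a time,
# using the random-effects model for each refit
metainf(m.hksj, pooled = "random")
```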
<br><br>
## GOSH Plot Analysis
In the previous [subchapter](#influenceanalyses), we explored the robustness of our meta-analysis results using influence analyses and the leave-one-out method. An even more sophisticated way to explore the patterns of effect sizes and heterogeneity in our data are so-called **Graphic Display of Heterogeneity** (**GOSH**) plots [@olkin2012gosh]. For these plots, we fit the same meta-analysis model to **all possible subsets** of our included studies. In contrast to the leave-one-out method, we therefore fit not only $K$ models, but models for all $2^{K}-1$ possible study subsets. This means that creating GOSH plots can become quite computationally intensive when the number of included studies is large. The *R* implementation we cover here therefore only fits a maximum of 1 million randomly selected models.
Once the models are calculated, we can plot them, displaying the **pooled effect size** on the **x-axis** and the **between-study heterogeneity** on the **y-axis**. This allows us to look for specific patterns, for example subclusters with different effect sizes. Such patterns would indicate that there is in fact more than one "population" of effect sizes in our data, warranting a subgroup analysis. If the effect sizes in our sample are homogeneous, the GOSH plot should form a **symmetric distribution with one peak**. To generate GOSH plots, we can use the `gosh` function in the `metafor` package. If you have not installed the package yet, install it and then load it from your library.
```{r, message=FALSE, error=FALSE}
library(metafor)
```
I will generate plots for my `m.hksj` meta-analysis object. Before we can generate the plot, we have to **"transform"** this object created by the `meta` package into a `metafor` meta-analysis object which can be used with the `gosh` function. To do this, I can simply use the data stored in `m.hksj`. The function to perform a meta-analysis in `metafor` is called `rma()`. I only have to provide the function with the effect size (`TE`), standard error (`seTE`), and between-study heterogeneity estimator (`method.tau`) stored in `m.hksj`. Because I want to replicate the results completely, I also use Knapp-Hartung adjustments in `rma()` by setting `test = "knha"`.
```{r}
m.rma <- rma(yi = m.hksj$TE,
             sei = m.hksj$seTE,
             method = m.hksj$method.tau,
             test = "knha")
```
Please note that if you use the fixed-effect model, you have to set `method = "FE"`.
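For example, a hypothetical fixed-effect version of the same call would look like this:
```{r, eval=FALSE}
# Fixed-effect variant (hypothetical; not used in this example)
m.rma.fe <- rma(yi = m.hksj$TE,
                sei = m.hksj$seTE,
                method = "FE")
```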
**We can then use this object to generate the GOSH plot object.** Depending on the number of studies in your analysis, this can take some time, up to a few hours. I save it as `dat.gosh`.
```{r, echo=FALSE}
load("_data/dat.gosh.rda")
```
```{r, eval=FALSE}
dat.gosh <- gosh(m.rma)
```
I can then display the GOSH plot by plugging the `dat.gosh` object into the `plot()` function.
```{r}
plot(dat.gosh, alpha= 0.1, col = "blue")
```
Interestingly, we see that there are **two patterns** in our data: one with lower effect sizes and lower heterogeneity, and one with higher effects, but considerable between-study heterogeneity. It seems like there are **two subclusters in our data**.
Now that we know the **effect size-heterogeneity** patterns in our data, the really important question of course is: which studies cause those patterns, and may thus belong to which subcluster? To answer this question, we have developed the `gosh.diagnostics` function. This function uses **three clustering (also known as unsupervised machine learning) algorithms** to detect clusters in the GOSH plot data and automatically determine which studies contribute the most to them. The function is part of the [`dmetar`](#dmetar) package. If you have the package installed already, you have to load it into your library first.
```{r chunk.gosh, eval=FALSE}
library(dmetar)
```
If you don't want to use the `dmetar` package, you can find the source code for this function [here](https://raw.githubusercontent.com/MathiasHarrer/dmetar/master/R/gosh.diagnostics.R). In this case, *R* doesn't know this function yet, so we have to let *R* learn it by **copying and pasting** the code **in its entirety** into the **console** in the bottom left pane of RStudio, and then hit **Enter ⏎**. The function requires the `dplyr`, `cluster`, `mvtnorm`, `factoextra`, `fpc`, `cowplot`, `reshape2`, and `flexmix` packages to be installed and loaded in your library to function properly. We will use the default settings for the function here, but many more are available (see the [documentation](https://dmetar.protectlab.org/reference/gosh.diagnostics.html)). We simply plug the `dat.gosh` object into the function. Again, as there is a lot of data to process, the function may need some time to finish.
```{r, echo=FALSE, message=FALSE, error=FALSE}
source("dmetar/gosh.diagnostics.R")
library(dplyr)
library(cluster)
library(mvtnorm)
library(factoextra)
library(fpc)
library(cowplot)
library(reshape2)
library(flexmix)
set.seed(123)
```
```{r, eval=FALSE}
gosh.diagnostics(dat.gosh)
```
```
Perform Clustering...
[==============================================================================================] DONE
Number of k-means clusters used: 3
Number of DBSCAN clusters detected: 3
Number of GMM clusters detected: 2
Identification of potential outliers
---------------------------------
Studies identified as potential outliers:
- K-means: 3 16
- DBSCAN: 3 10 16
- Gaussian Mixture Model: 3 16
```
<center>
![](_figs/KMEANS.png){width=350px} ![](_figs/DBSCAN.png){width=350px}
</center>
<center>
![](_figs/GMM.png){width=350px} ![](_figs/study3.png){width=350px}
</center>
<center>
![](_figs/study10.png){width=350px} ![](_figs/study16.png){width=350px}
</center>
We see that the **three algorithms**, $k$-means, DBSCAN and the Gaussian Mixture Model, have detected **three studies** which might potentially contribute to the **cluster imbalance**: study 3, study 10, and study 16. These studies seem to nearly fully account for the second high-effect-size, high-heterogeneity cluster we found.
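To see which study labels these numbers correspond to, we can index the `studlab` slot of our original meta-analysis object (a quick sketch; the numbers refer to each study's position in the analysis):
```{r, eval=FALSE}
m.hksj$studlab[c(3, 10, 16)]
```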
**Let us see what happens if we rerun the meta-analysis, this time removing these three studies.**
```{r}
metagen(TE,
        seTE,
        data = madata,
        studlab = paste(Author),
        comb.fixed = FALSE,
        comb.random = TRUE,
        method.tau = "SJ",
        hakn = TRUE,
        prediction = TRUE,
        sm = "SMD",
        exclude = c(3, 10, 16))
```
We see that the between-study heterogeneity has **dropped to** $0\%$, indicating that the studies we are now analyzing **stem from one homogeneous (sub)population**, the lower cluster in our GOSH plot.
---