Work on eye movement desensitization and reprocessing (EMDR) has grown almost continuously since the first randomized clinical trials (RCTs) of the technique were published in 1994, both for post-traumatic stress disorder (PTSD) and other disorders (Marín et al., 2016). A large number of studies have presented EMDR as superior to other active treatments (Novo Navarro et al., 2018). However, many of these papers have significant methodological limitations (Institute of Medicine, 2008). The present study updates the evidence for EMDR as a possibly effective treatment for PTSD in the adult population. Most published studies show significant deficiencies in methodological rigor: they mix child and adult populations, select studies with a high risk of bias, and include RCTs that combined patients diagnosed with PTSD with patients classified under the label traumatic memories.
The main objectives of this study are: (1) to review the current scientific production regarding the use of EMDR in PTSD; (2) to examine the methodological quality and rigor of the randomized clinical trials on the efficacy of EMDR in PTSD that were selected for review and subsequent meta-analysis; (3) to evaluate the efficacy of the EMDR technique in the treatment of PTSD, using the effect size estimator as a measure of the magnitude of change produced; (4) to analyze the effectiveness of EMDR in reducing the anxious and depressive symptomatology associated with PTSD, based on the magnitude of the change produced (effect size); (5) to verify and evaluate the degree to which the results obtained can be generalized, through various statistical analyses.
Method
Participants
Eighteen studies were included and analyzed, with a total sample of N = 1213 subjects (M = 67.38 per study; SD = 46.06) and an age range of 18-75 years (M = 37.38; SD = 3.38).
Procedure
This study followed the procedure established by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) publication guidelines to improve completeness in the reporting of systematic reviews and meta-analyses of randomized clinical trials (Page et al., 2021).
Literature Search
A systematic review of studies using the EMDR therapeutic method for the treatment of PTSD was carried out. The articles were identified through a detailed three-phase literature search designed to increase the likelihood of retrieving all relevant publications on the subject. The procedure was similar to that used by Bradley et al. (2005). First, the SCOPUS database was used to find the articles published by the journals with the highest impact. Second, the following databases were searched: Academic Search Complete, CINAHL, CSIC, ERIC, Medline, PsycArticles, Psychology and Behavioral Sciences, PsycInfo, PubMed and Web of Science. Using the thesaurus of the American Psychological Association (Psychological Index Terms) and the National Library of Medicine (Psy-MeSH) as a guide, the terms used in the search equation were a combination of the descriptors "emdr or eye movement desensitization reprocessing or eye movement and desensitization reprocessing", "ptsd or post traumatic stress disorder or posttraumatic stress disorder or post-traumatic stress disorder" and "therapy or treatment or intervention", joined using the Boolean operator AND. Finally, using the meta-analyses found in the second phase, clinical studies that met the selection criteria were added manually in order to detect studies that the search engines had not identified.
Inclusion/Exclusion Criteria
Both the exclusion criteria and the inclusion criteria of this study were based on the objectives proposed in similar studies. Thus, studies were included in the meta-analysis if they had the following characteristics: (1) published between January 1991 and January 2022; (2) written in English or Spanish; (3) original studies; (4) of quantitative type; (5) randomized clinical trials meeting the RCT criteria established by Cochrane; (6) patients had been diagnosed with PTSD according to DSM-III-R, DSM-IV, DSM-IV-TR, DSM-5, ICD-10 or ICD-11 criteria; (7) treated with EMDR by health professionals trained in EMDR; (8) EMDR efficacy was investigated; (9) had at least one control group (patients receiving no treatment or another treatment); (10) sample composed of adult population; (11) peer-reviewed.
Studies were excluded if: (1) patients had another comorbid diagnosis or the diagnosis of PTSD could be attributed to the physiological effects of a substance (American Psychiatric Association, 2013); (2) EMDR was administered alongside another psychotherapeutic or pharmacological treatment; (3) they had a Jadad score of less than three points; (4) they did not present sufficient clinical measures or did not have adequate statistical analysis (effect size, number of subjects in the sample, t value, F value, odds ratio, p value, mean differences, and standard deviation); or (5) they were quasi-experimental studies, case studies, single-group experimental studies, or qualitative studies.
Data Analysis
Systematic Review
The RCT selection process is described in the flow chart (Page et al., 2021) (see Figure 1). After examination, 18 potentially relevant studies were selected for review and screened using the Jadad methodological validity scale. Inter-rater reliability was tested with Cohen's kappa to avoid possible selection bias (Cohen, 1960), yielding high agreement (κ = .91). All the studies obtained a score of three points or more, with an average methodological quality of 4 points (Jadad et al., 1996). The characteristics of the selected studies are detailed in Table 1.
Authors | Jadad | Type of trauma (population) | Intervention (experimental / control) | Total N (women) | n per group (women) | % SCTS per group (POST→FOLLOW) | Average age (years) | Follow-up (months) | NSP | Session length (min) | PRO |
---|---|---|---|---|---|---|---|---|---|---|---|
Acarturk et al. (2016) | 4 | War (Syrian refugees) | EMDR / Waiting list | 98 (73) | 49 (39); 49 (34) | 75.5%→63.2%; 67.3%→67.3% | 38.54 | 1 | ≥ 2 | 90 | PSIC |
Boterhoven de Haan et al. (2020) | 5 | Childhood trauma (Australia, Germany and the Netherlands) | EMDR / ImRs | 155 (119) | 81 (65); 74 (54) | 82.71%→69.13%; 86.48%→66.21% | 33.68 | 12 | 12 | 90 | PSIC |
Carlson et al. (1998) | 3 | Vietnam war veterans (USA) | EMDR / Relaxation-biofeedback / Regular clinical care | 35 | 10; 13; 12 | 100%→80%; 100%→30.7%; 100% (reported as FOLLOW→FOLLOW) | 48.04 | 3-9 | 12.2 | 60-75 | TERA |
Devilly & Spence (1999) | 3 | Mixed (Australia) | EMDR / TTP | 22 (17) | 11 (8); 12 (7) | 45.4%→45.4%; 75%→75% | 37.96 | 3 | 7 | 90 | TERA |
Högberg et al. (2007) | 5 | Traffic accident / assault (Sweden) | EMDR / Waiting list | 24 (5) | 13 (3); 11 (2) | 92.3%; 81.8% | 43 | 35* | 5 | 90 | PSIC |
Ironson et al. (2002) | 3 | Mixed (USA) | EMDR / Prolonged exposure | 22 (17) | 10 (?); 12 (?) | 100%→60%; 75%→50% | 16-62 | 3 | 5 | 90 | PSIC-ST |
Karatzias et al. (2011) | 5 | Mixed (Scotland) | EMDR / Emotional freedom techniques | 46 (26) | 23 (14); 23 (12) | 56.5%→47.8%; 60.8%→52.1% | 40.6 | 3 | 12 | 90 | PSIC, PSIQ |
Lee et al. (2002) | 3 | Mixed (Australia) | EMDR / SITPE | 24 (11) | 12 (?); 12 (?) | 100%→100%; 100%→100% | 35.3 | 3 | 8 | 60 | PSIC |
McGuire et al. (2020) | 5 | Mixed (Australia) | EMDR / Prolonged exposure | 20 (?) | 10 (?); 10 (?) | 100%→70%; 100%→80% | 42.15 | 6 | 8 | 60 | TERA |
Nijdam et al. (2012) | 5 | Mixed (Netherlands) | EMDR / Brief eclectic psychotherapy | 140 (72) | 70 (36); 70 (36) | 74.2%; 71.4% | 37.8 | No follow-up | 15 | 90 | PSIQ-ST |
Nijdam et al. (2018) | 4 | Mixed (Netherlands) | EMDR / Brief eclectic psychotherapy | 116 (61) | 57 (28); 59 (33) | 75.43%; 64.4% | 38.53 | No follow-up | 6.64 | 90 | PSIQ-ST |
Power et al. (2002) | 3 | Mixed (Scotland) | EMDR / Exposure + cognitive restructuring / Waiting list | 72 (30) | 27 (12); 21 (8); 24 (10) | 69.23%→56.4%; 56.75%→45.9%; 82.76%→no follow-up | | 15 | 4.2 | 90 | TERA |
Rogers et al. (1999) | 3 | Vietnam war veterans (USA) | EMDR / Exposure | 12 | 6; 6 | 100%; 100% | 47-53 | No follow-up | 1 | 60-90 | TERA |
Rothbaum et al. (2005) | 5 | Rape victims (Georgia) | EMDR / Prolonged exposure / Waiting list | 72 (72) | 25 (25); 23 (23); 24 (24) | 80%→76%; 86%→78.2%; 83%→no follow-up | 33.8 | 6 | 9 | 90 | PSIC-ST |
Taylor et al. (2003) | 3 | Mixed (Canada) | EMDR / Exposure therapy / Relaxation therapy | 60 (45) | 19 (12); 22 (8); 19 (10) | 78.9%→78.9%; 68.1%→68.1%; 78.9%→78.9% | 37 | 3 | 8 | 60-90 | TERA |
Ter Heide et al. (2016) | 4 | War (Syrian refugees) | EMDR / Usual mental health treatment in refugee centers | 72 (20) | 36 (6); 36 (14) | 83.3%→69.4%; 77.7%→63.8% | 20.93 | 3 | 12 | 60 | PSIC-ST |
van der Kolk et al. (2007) | 5 | Mixed (USA) | EMDR / Fluoxetine / Placebo | 88 (55) | 29 (22); 30 (26); 29 (25) | 82.7%→72.4%; 86.7%→60%; 89.6%→no follow-up | 36.1 | 6 | 6 | 90 | PSIC, PSIQ |
van Vliet et al. (2021) | 4 | Childhood abuse (Netherlands) | EMDR / STAIR | 135 (83) | 67 (43); 68 (40) | 80%→80%; 64.7%→64.7% | 18-65 | 6 | 16 | 90 | PSIC |
Note. Values for each group are separated by semicolons and follow the order of the Intervention column. NSP = number of sessions per patient in the EMDR condition; PRO = professional who provided treatment; % SCTS = percentage of subjects who completed all follow-up measure collections; POST = percentage of subjects who completed all posttreatment measures; FOLLOW = percentage of subjects who completed all follow-up measures; EH = heteroadministered scales; EA = self-administered scales; PSIC = clinical psychologist specializing in mental health; PSIQ = psychiatrist specializing in mental health; PSIC-ST = master's or doctoral student in clinical psychology; PSIQ-ST = psychiatry resident; TERA = therapist; EMDR = eye movement desensitization and reprocessing; ImRs = Imagery rescripting; SITPE = Stress inoculation training with prolonged exposure treatment; CBT = Cognitive behavioral therapy; TTP = Cognitive behavioral treatment protocol for trauma; NE = Not specified;
* = (Högberg et al., 2008);
? = Not reported.
A bibliometric analysis of the articles selected for the meta-analysis was performed. The mean age of the articles was 13.22 ± 8.06 years, 95% CI [6.05, 12.08]. The newest article was one year old and the oldest was 24 years old. To assess the obsolescence of the articles, we used the Price index (percentage of articles less than 5 years old), with a result of 22.22%, and the Burton-Kebler index (the median age), with a result of 15 years.
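The obsolescence figures above can be reproduced from publication years alone. The following is a minimal sketch, assuming that article age is counted back from 2022 (the end of the search window; this reference year is our assumption and is not stated in the text) and that the publication years are those listed in Table 1. Variable names are illustrative.

```python
from statistics import mean, median, stdev

# Publication years of the 18 trials, read from Table 1.
PUBLICATION_YEARS = [
    2016, 2020, 1998, 1999, 2007, 2002, 2011, 2002, 2020,
    2012, 2018, 2002, 1999, 2005, 2003, 2016, 2007, 2021,
]
REFERENCE_YEAR = 2022  # assumption: age is counted back from the end of the search window

ages = [REFERENCE_YEAR - year for year in PUBLICATION_YEARS]

mean_age = mean(ages)      # average article age in years
sd_age = stdev(ages)       # sample standard deviation of article age
# Percentage of articles less than 5 years old.
pct_under_5 = 100 * sum(age < 5 for age in ages) / len(ages)
median_age = median(ages)  # median article age in years

print(f"mean age = {mean_age:.2f}, SD = {sd_age:.2f}")
print(f"articles under 5 years old = {pct_under_5:.2f}%")
print(f"median age = {median_age} years")
```

Under these assumptions the script gives a mean age of 13.22 years (SD 8.06), 22.22% of articles under five years old, and a median age of 15 years, in line with the values reported above.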
Experimental Dropout Analysis
The relative risk of prematurely leaving treatment in the EMDR-treated groups versus the control groups was analyzed using MedCalc's relative risk calculator (MedCalc, 2023). No significant difference was found between the two groups in the number of participants who dropped out of treatment (relative risk = 1.04, 95% CI [0.98, 1.10]; p = .13).
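For readers who want to check a relative risk by hand, the sketch below computes it with a 95% confidence interval on the log scale. It assumes the standard log-transformed (Katz) interval; the function name and the counts in the usage example are hypothetical and are not the trial data.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z_crit=1.96):
    """Relative risk of group A vs. group B with a log-scale confidence interval.

    events_* are counts of participants with the outcome (e.g. dropout),
    n_* are group sizes. All counts must be greater than zero.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of ln(RR) for two independent proportions.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z_crit * se_log_rr)
    upper = math.exp(math.log(rr) + z_crit * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, for illustration only.
rr, lower, upper = relative_risk(events_a=20, n_a=100, events_b=10, n_b=100)
print(f"RR = {rr:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An interval that contains 1, as in the result reported above, indicates no statistically significant difference in risk between the groups.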
Bias Risk
Figures 2 and 3 provide data on risk of bias assessed with the Cochrane Collaboration's Tool for Assessing Risk of Bias (Higgins et al., 2011), which classifies studies as high, moderate, or low risk across six domains.
Results
Meta-analysis of Intervention Effects: Effect Size and Statistical Heterogeneity, Analysis of Population Prediction Intervals and Sensitivity Analysis
To estimate the efficacy of the EMDR intervention in the trials, effect sizes were calculated as standardized mean differences (Hedges' g) (Rosenthal, 1991) with 95% confidence intervals. Hedges' g was used instead of Cohen's d because it provides a more precise estimate with small samples (Grissom & Kim, 2005). RevMan 5.4 (The Cochrane Collaboration, 2020) and Meta-Essentials (Van Rhee et al., 2015) statistical software were used. Effect sizes were calculated using the random-effects model, which estimates the overall effect size assuming that the studies are a sample of the totality of studies and/or that the studies are heterogeneous. In addition, between-study heterogeneity was assessed with the χ² test (Q test) at the 95% confidence level (p values > .05 indicate an absence of between-study heterogeneity). The I² statistic was also used to quantify between-study heterogeneity. To detect outliers among the effect sizes in each analysis, the Galbraith plot was used. Measures that fell outside the confidence interval area were considered outliers and were removed from the meta-analysis. Thus, the study by Acarturk et al. (2016) was removed from the analyses as an outlier. The 95% prediction interval of each random-effects meta-analysis was examined in order to facilitate the generalization of the results to clinical practice and to give a more uniform and accurate estimation of the results. In addition, a sensitivity analysis was performed on each meta-analysis to assess the influence of each individual study on the total effect size, and the random-effects model was compared with the fixed-effect (FE) model. The aim was to determine the robustness of the observed results to the assumptions made when conducting the analysis (Figures 14-23, included in the scientific repository GREDOS with link https://gredos.usal.es/handle/10366/151051).
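The quantities described in this section can be sketched in a few lines. The code below is illustrative only: it assumes the DerSimonian-Laird estimator for the between-study variance and a z-based confidence interval (the exact settings used in RevMan and Meta-Essentials are not stated in the text), it takes positive g to mean lower symptom scores under EMDR, and all function and parameter names are our own.

```python
import math


def hedges_g(mean_ctrl, sd_ctrl, n_ctrl, mean_emdr, sd_emdr, n_emdr):
    """Return Hedges' g and its variance for one comparison.

    Assumed sign convention: positive g = lower symptom scores with EMDR.
    """
    df = n_ctrl + n_emdr - 2
    pooled_sd = math.sqrt(
        ((n_ctrl - 1) * sd_ctrl**2 + (n_emdr - 1) * sd_emdr**2) / df
    )
    d = (mean_ctrl - mean_emdr) / pooled_sd  # Cohen's d
    j = 1 - 3 / (4 * df - 1)                 # small-sample correction factor
    g = j * d
    n_total = n_ctrl + n_emdr
    var_g = j**2 * (n_total / (n_ctrl * n_emdr) + d**2 / (2 * n_total))
    return g, var_g


def random_effects(effects, variances, z_crit=1.96):
    """Pool effect sizes (needs at least two studies); return summary statistics."""
    k = len(effects)
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # DerSimonian-Laird between-study variance
    i2 = max(0.0, 100 * (q - (k - 1)) / q) if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "fixed": fixed,
        "pooled": pooled,
        "ci": (pooled - z_crit * se, pooled + z_crit * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

Comparing the `fixed` and `pooled` values from the same call mirrors the fixed-effect versus random-effects sensitivity check described above.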
The meta-analysis of traumatic symptoms at post-treatment included 21 comparisons. EMDR therapy produced a decrease in trauma-associated symptoms at post-treatment with a small, statistically significant effect size and moderate, statistically significant heterogeneity (g = 0.33, z = 3.07, p = .002, 95% CI [0.12, 0.54], PI [-0.47, 1.10]); (Q = 0.12, p = .001; I² = 56%); (FE, g = 0.29). The meta-analysis of PTSD symptoms at maintenance included 11 comparisons. It showed a small, non-statistically significant effect size and moderate, statistically significant heterogeneity (g = 0.02, z = 0.09, p = .93, 95% CI [-0.31, 0.34], PI [-1.09, 0.74]); (Q = 0.14, p = .01; I² = 55%); (FE, g = -0.01).
The meta-analysis of depressive symptoms included 19 comparisons at post-treatment and 12 comparisons at maintenance. EMDR therapy produced a decrease in trauma-associated depressive symptoms with a small effect size at post-treatment (statistically significant) and at maintenance (not statistically significant) (g = 0.43, z = 3.33, p = .01, 95% CI [0.18, 0.69], PI [-0.73, 1.14]); (FE, g = 0.39); (g = 0.13, z = 0.79, p = .43, 95% CI [-0.19, 0.44], PI [-0.84, 0.9]); (FE, g = 0.1). The analysis presented moderate heterogeneity at post-treatment and maintenance, both statistically significant (Q = 0.19, p < .0001; I² = 65%); (Q = 0.15, p = .02; I² = 52%).
The meta-analysis of anxious symptoms at post-treatment included 11 comparisons. EMDR therapy produced a decrease in trauma-associated anxiety symptoms with a statistically significant moderate effect size and statistically significant moderate heterogeneity (g = 0.53, z = 3.1, p = .003, 95% CI [0.19, 0.86], PI [-0.67, 1.55]); (Q = 0.19, p = .003; I² = 62%); (FE, g = 0.48). The maintenance meta-analysis included 5 comparisons. EMDR therapy did not produce a decrease in trauma-associated anxious symptoms relative to the control groups, with a non-statistically significant effect size and non-statistically significant moderate heterogeneity (g = -0.11, z = 0.4, p = .69, 95% CI [-0.66, 0.44], PI [-1.21, 1.34]); (Q = 0.2, p = .07; I² = 54%); (FE, g = -0.17) (Figures 4-13, included in the scientific repository GREDOS with link https://gredos.usal.es/handle/10366/151051).
The prediction intervals of all meta-analyses included values below 0, indicating that, within the range expected to contain 95% of study effects, EMDR will sometimes not be useful. Sensitivity analysis revealed that neither changing the statistical model nor removing any individual study affected the overall effect size.
Publication Bias
Analysis of publication bias aims to check whether the effect size is overestimated because studies with non-significant results are rarely published. Scientific journal publishers tend to publish mostly investigations that report significant effects rather than those that do not (Rosa Garrido, 2016). A conservative method of addressing this problem is to calculate the "fail-safe N": assume that the effect sizes of all current or future unpublished studies are equal to 0 and calculate the number of such studies needed to reduce the overall effect size to a non-significant level (Rosenthal & Rubin, 1988). For a more precise estimation, we examined the funnel plot of the effect sizes of each meta-analysis, ruling out publication bias if the data were symmetrically distributed. In addition, we calculated the number of studies needed to correct for funnel plot asymmetry using the Trim and Fill method. To complement these results, we tested the asymmetry of the funnel plots with two statistical methods, Egger's test and Begg and Mazumdar's adjusted rank test (Figures 24-29, included in the scientific repository GREDOS with link https://gredos.usal.es/handle/10366/151051).
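As an illustration of the fail-safe N idea, the sketch below implements Rosenthal's file-drawer formula, which counts how many unpublished null-result studies would bring the combined z below a one-tailed significance threshold. Treating this as the variant used here is an assumption on our part; the function name and the z scores in the usage example are hypothetical.

```python
def rosenthal_fail_safe_n(z_scores, z_alpha=1.645):
    """Number of zero-effect studies needed to make the combined result non-significant.

    z_scores: one z value per included comparison.
    z_alpha: critical z (1.645 corresponds to one-tailed p = .05).
    """
    k = len(z_scores)
    n = (sum(z_scores) ** 2) / (z_alpha**2) - k
    # A negative value means the combined result is already non-significant.
    return max(0.0, n)


# Hypothetical z scores, for illustration only.
print(rosenthal_fail_safe_n([2.0, 2.0, 2.0]))
print(rosenthal_fail_safe_n([0.5, 0.5]))
```

A fail-safe N of 0 therefore means that not even one additional null study is needed for the pooled result to lose significance.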
Another publication bias analysis was performed on the studies included in the meta-analysis. All analyses showed a fail-safe N of n = 0, except for the analysis of anxious symptoms at post-treatment, with a fail-safe N of n = 3. The funnel plots of the post-treatment analyses of anxious and depressive symptoms were asymmetrical; the remaining analyses presented symmetrical funnel plots. Using the Trim and Fill method, we estimated that four and six more studies, respectively, would need to be incorporated to correct their asymmetry. Both Egger's regression test and Begg and Mazumdar's adjusted rank correlation test revealed an absence of publication bias in the study of traumatic symptoms at post-treatment (t = 0.62, p = .54) (z = -0.19, p = .42) and at maintenance (t = 1.3, p = .22) (z = -0.54, p = .29); in studies measuring depressive symptoms at maintenance (t = 1.07, p = .3) (z = 0, p = .5); and in studies measuring anxious symptoms at post-treatment (t = 1.46, p = .17) (z = 1.32, p = .09) and maintenance (t = 0.18, p = .87) (z = 0, p = .5). In the study of depressive symptoms at post-treatment, conflicting results were found (t = 3.17, p = .006) (z = -0.87, p = .19), so the results are inconclusive about possible publication bias.
Subgroups and Moderating Variables Analysis
The effect size of the > 60 min therapy group was significantly larger in the analysis of PTSD symptoms at posttreatment (g = 0.38) (QB = 0.45; p = .002), depressive symptoms at posttreatment (g = 0.5) (QB = 1.3; p = .0008) and anxious symptoms at post-treatment (g = 0.56) (QB = 3.96; p = .004) than that of the group with therapy sessions of ≤ 60 min (g = 0.06); (g = 0.08); (g = 0.31), respectively.
In contrast, in the analysis of traumatic symptoms the effect size of the ≤ 8 sessions group (g = 0.25) was somewhat smaller (QB = 0.45; p = .05) than that of the > 8 sessions group (g = 0.44; QB = 0.45; p = .02).
In addition, the effect size of the < 8 sessions group was larger in the analysis of depressive (g = 0.75) (QB = 1.3; p = .0003) and anxious (g = 0.68) (QB = 3.96; p = .03) symptoms at posttreatment than the ≥ 8 sessions group (g = 0.26); (g = 0.45; p = .03).
Regarding the age of the treated subjects, those aged ≤ 40 years showed significantly smaller effect sizes at posttreatment for traumatic (g = 0.31; QB = 0.44; p = .03), depressive (g = 0.37; QB = 1.06; p = .01) and anxious (g = 0.47; QB = 3.96; p = .03) symptoms than subjects aged > 40 years (g = 0.31; QB = 0.44; p = .03); (g = 0.57); (g = 0.67; QB = 3.96; p = .03).
Studies with a methodological quality of 3 points using the Jadad scale showed significantly larger effect sizes at post-treatment for traumatic (g = 0.69; QB = 0.45; p = .0004) and anxious symptoms (g = 0.79; QB = 3.96; p = .0002) than those studies with a methodological quality of > 3 points (g = 0.16; QB = 0.45; p = .05); (g = 0.29).
The effect size of EMDR was significantly larger against wait-list control or non-active treatment in the analysis of traumatic (g = 0.67; QB = 0.45; p = .02), depressive (g = 0.88; QB = 1.3; p = .002), and anxious (g = 0.61; QB = 3.96; p = .03) symptoms than against active treatment (g = 0.21; QB = 0.45; p = .04); (g = 0.24); (g = 0.49; p = .03).
At posttreatment, studies that had behavioral therapy or cognitive-behavioral therapy as a control group had smaller effect sizes than those that did not in the analysis of traumatic (g = 0.09) vs (g = 0.42; QB = 0.45; p = .00008), depressive (g = 0.16) vs (g = 0.54; QB = 1.3; p = .00001) and anxious (g = 0.34) vs (g = 0.56; QB = 3.96; p = .00007) symptoms, and at maintenance in the analysis of depressive (g = 0.47; QB = 0.4; p = .04) vs (g = 0.31) and anxious (g = -0.7; QB = 0.21; p = .02) vs (g = 0.22) symptoms, respectively. In the latter two analyses, behavioral or cognitive-behavioral therapy proved more effective than EMDR.
At the same time, the multidisciplinary therapist group (psychologists, psychiatrists, or other type of psychotherapist) had a somewhat larger effect size in the analysis of posttraumatic (g = 0.47) (QB = 0.45; p = .005) and depressive symptoms (g = 0.48) (QB = 1.3; p = .005) at post-treatment than that composed solely of psychologists (g = 0.13); (g = 0.37; QB = 1.3; p = .0006). In contrast, the post-treatment analysis of anxious symptoms had a somewhat smaller effect size (g = 0.53) than the psychologist-only analysis (g = 0.54), although the results were not statistically significant.
The group treated by professional therapists (g = 0.28) showed a significantly smaller effect size (QB = 0.45; p = .02) than the group treated by students (g = 0.44) in the post-treatment analysis of traumatic symptoms. In contrast, in the post-treatment analysis of depressive and anxious symptoms, the group treated by professional therapists showed a significantly larger effect size than the group treated by students (g = 0.5; QB = 1.3; p = .00009) vs. (g = 0.28); (g = 0.64) (QB = 3.96; p = .005) vs. (g = 0.38).
Studies with a > 50% female sample showed significantly larger effect sizes for traumatic symptoms at post-treatment (g = 0.65; QB = 0.41; p = .03) and for depressive symptoms at post-treatment (g = 0.31; QB = 1.13; p = .005) and maintenance (g = 0.49; QB = 0.47; p = .04) than studies with a ≤ 50% female sample (g = 0.15); (g = 0.22; QB = 1.13; p = .01); (g = 0.18); and a significantly smaller effect size in the analysis of anxious symptoms at post-treatment (g = 0.22) and at maintenance (g = -0.49; QB = 0.47; p = .04) than studies with a ≤ 50% female sample (g = 0.65; QB = 3.91; p = .01); (g = 0.04).
The war-related group of patients showed more benefit in the post-treatment analysis of traumatic (g = 0.65; QB = 0.45; p = .03), depressive (g = 0.39; QB = 1.3; p = .005) and anxious (g = 0.39; QB = 1.3; p = .005) symptoms than the unrelated group (g = 0.28; QB = 0.45; p = .02); (g = 0.73); (g = 0.73). Statistically significant differences were also found between the group of patients with war-related PTSD (g = -0.49; QB = 0.21; p = .04) and the not-related group (g = -0.52) in the analysis of maintenance of anxious symptoms.
Articles published in 2007 or earlier showed significantly larger effect sizes at posttreatment for traumatic (g = 0.54; QB = 0.45; p = .001), depressive (g = 0.58; QB = 1.3; p = .001), and anxious (g = 0.6; QB = 3.96; p = .007) symptoms than those published after 2007 (g = 0.09); (g = 0.18); (g = 0.28) (Tables 2-4, included in the scientific repository GREDOS with link https://gredos.usal.es/handle/10366/151051).
Meta-regression Analysis
Meta-analyses of PTSD, depression, and anxiety symptoms at both posttreatment and maintenance were analyzed using unrestricted maximum likelihood meta-regressions to test the effect of a set of continuous variables (study publication year, number of treatment sessions, treatment duration, and sample size) on effect size. The meta-regressions for treatment duration and sample size were statistically significant, with positive slopes in relation to effect size (β = 0.32; p = .02) and (β = 0.08; p = .001), respectively. The rest of the results were not statistically significant (Table 5, included in the scientific repository GREDOS with link https://gredos.usal.es/handle/10366/151051).
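To make the slope β concrete, the sketch below fits a single-moderator meta-regression by weighted least squares. It is a simplification of the approach described above: the between-study variance tau2 is passed in as a known value rather than estimated by unrestricted maximum likelihood, and the function name and example data are hypothetical.

```python
import math


def weighted_meta_regression(moderator, effects, variances, tau2=0.0):
    """Regress effect sizes on one continuous moderator with inverse-variance weights.

    tau2 is the between-study variance, treated here as given (simplification).
    Returns (slope, intercept, standard error of the slope).
    """
    w = [1 / (v + tau2) for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    sxy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, moderator, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1 / sxx)
    return slope, intercept, se_slope


# Hypothetical data: effect size rising with a moderator such as session length.
slope, intercept, se_slope = weighted_meta_regression(
    moderator=[1, 2, 3], effects=[0.1, 0.2, 0.3], variances=[0.04, 0.04, 0.04]
)
print(f"slope = {slope:.3f}, z = {slope / se_slope:.2f}")
```

A positive slope, as reported for treatment duration and sample size, means that the effect size increases as the moderator increases.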
Certainty of Evidence
The degree of certainty of each comparison was analyzed following the GRADE (Grading of Recommendations Assessment, Development and Evaluation) methodology, with the aim of reporting the degree of confidence in the recommendation of the data (Schünemann et al., 2013) (Table 6, included in the scientific repository GREDOS with link https://gredos.usal.es/handle/10366/151051).
Discussion
Among the main findings of the meta-analysis (N = 18), in comparison with other meta-analyses published in the last 10 years, the following stand out. In the analysis of PTSD symptom reduction, a small effect size was found at both post-treatment and maintenance, similar to those reported by Haagen et al. (2015) and Cuijpers et al. (2020) (although at maintenance they reported a moderate effect size), as opposed to the large or moderate effect sizes in other meta-analyses such as those by Chen et al. (2014) (which included studies of both child and adolescent and adult populations), Cusack et al. (2016), Watts et al. (2013) and Ehring et al. (2014) (whose sample consisted solely of patients who had suffered abuse during childhood). However, some of these meta-analyses did not include a population with a complete diagnosis of PTSD, mixed subjects under the construct traumatic memories with patients diagnosed with PTSD, or included studies without sufficient methodological quality; thus their effect sizes may have been overestimated.
Regarding the comparison of EMDR vs. CBT in reducing post-traumatic symptoms, EMDR was shown to be superior to CBT, as in the studies by Khan et al. (2018) and Cuijpers et al. (2020), with a small effect size at post-treatment. At maintenance, however, a small, non-statistically significant effect size in favor of CBT was found, as opposed to the small effect size in favor of EMDR found in Khan et al. (2018) and Cuijpers et al. (2020).
Similarly, the effect size found for depressive symptoms was small and therefore differed from the results of other meta-analyses (Chen et al., 2014; Cusack et al., 2016). The effect size obtained for anxious symptoms was moderate, similar to that in the study by Chen et al. (2014).
On comparing EMDR vs. CBT, in reducing anxiety symptoms, small effect sizes were found, unlike the large effect size reported by Khan et al. (2018) in favor of EMDR. Similar to Khan et al. (2018), a small effect size was reported for depressive symptom reduction in favor of EMDR. Of note, the effect sizes for depressive and anxious symptoms were not statistically significant.
Subgroup analysis shows that studies conducted by professional therapists, with a larger number of sessions, sessions longer than 60 minutes, subjects older than 40 years, a non-active control group, a less rigorous methodology, publication before 2007 and a war-related population presented larger effect sizes, regardless of the type of professional who conducted the therapy. These findings appeared across the board in the analysis of PTSD, anxiety and depressive symptoms at post-treatment. Regarding the comparison of EMDR with therapies from the spectrum of behavior modification therapies or CBT in the reduction of post-traumatic symptoms, a moderate and significant effect size in favor of EMDR was obtained at post-treatment, although at maintenance behavioral therapies and CBT were more effective (without statistical significance). In contrast, behavior modification therapies or CBT were more effective than EMDR in the maintenance of anxious and depressive symptoms. When comparing the results of the subgroup analysis with those presented by Chen et al. (2014), certain similarities stand out: EMDR was more effective in sessions of > 60 min and when conducted by multidisciplinary therapists in the reduction of post-traumatic, depressive and anxious symptoms, although the effect sizes they reported were significantly higher than those obtained in the present meta-analysis.
When comparing the effect of EMDR in post-treatment to a non-active control, a moderate effect size was obtained in contrast to the large effect size reported by Morina et al. (2021) and Thompson et al. (2018), although this latter had a sample of only two studies. Similar to Morina et al. (2021) and Thompson et al. (2018), a small effect size in favor of EMDR was obtained when compared to an active control group.
Similarly, in the meta-analysis by Cuijpers et al. (2020) the effect size of RCTs on PTSD symptom reduction in post-treatment with lower methodological quality was larger with a moderate effect size than those with higher methodological quality that obtained a small effect size. Perhaps the difference between the results of this meta-analysis and those by the majority of meta-analyses presented which address the efficacy of EMDR for post-traumatic stress disorder is, either centrally or peripherally, due to much more restrictive inclusion/exclusion criteria in terms of the relevance and methodological quality of the papers.
Before rendering these results to the clinical setting, a number of limitations should be considered. First, the meta-analysis included only eighteen randomized clinical trials, which may have contributed to the low level of effect size compared to other meta-analyses cited above. Secondly, the number of subjects in some of the selected RCTs was very low. Thirdly, the studies included in the meta-analysis were highly heterogeneous in terms of the type of control groups. As the studies differed in many factors (number and duration of sessions, training of therapists, type of population...), it is complex to attribute any differences in effect sizes solely to the therapeutic approach that could affect the final effect size. Fourthly, the different scales of the included papers could also affect the overall effect size and the results of the subgroup analysis. Fifth, in some RCTs patients received few EMDR sessions. Eight of the included studies had < 8 sessions. This number of sessions might be insufficient to properly apply the standard EMDR protocol, with eight phases, to such a complex psychological problem as PTSD. Sixth, due to the small number of studies, we were unable to conduct a multivariate analysis in the subgroups that could explain the contribution of the factors used. Seventh, the analyses conducted in maintenance had a smaller number of studies and thus participants. Several studies did not have the necessary measures for PTSD symptoms, depression and/or anxiety or did not follow up on these measures. It would be essential for future RCTs to have sustained and adequate maintenance over time in order to be able to infer more accurately whether the changes made in therapy are maintained over time. Eighth, there are difficulties in classifying the different treatments under a complete theoretical model. Introducing a treatment into one or the other category could alter the analysis of the subgroups. 
Our criterion was based on the label each treatment received in its corresponding study. Ninth, the effect sizes found show moderate heterogeneity, although lower than in other meta-analyses, so the results should be interpreted with caution. Tenth, according to the criteria of the Cochrane Collaboration's Tool for Assessing Risk of Bias, the selected RCTs showed high or unclear risk of bias in some domains. Risks of bias that may have affected the results of the present meta-analysis included: (a) missing or incomplete data in most RCTs; (b) high experimental mortality (dropout) rates, although we did not find a statistically significant relative risk of dropout; and (c) the method of blinding of raters, which some RCTs do not describe or which could not be maintained throughout the experiment or during maintenance. Regarding (b), high dropout rates are common in RCTs in which a psychological treatment is applied to a given pathology; however, lax methods were used to handle the data affected by experimental mortality. It is also possible that the very characteristics of PTSD, in which avoidance symptoms predominate, affect the experimental mortality rate. Thus, the methodological limitations of the studies may have contributed to an overestimation of the effect size in favor of EMDR. A meta-analysis will always be limited in its power of inference by the studies it contains.
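As an illustration of how heterogeneity among effect sizes is commonly quantified, the following minimal sketch computes Cochran's Q and the I² index under a fixed-effect weighting. It assumes these standard statistics for illustration only; the passage above does not state which heterogeneity index was used, and the effect sizes and variances below are hypothetical values, not data from the included RCTs.

```python
def heterogeneity(effects, variances):
    """Return (pooled fixed-effect estimate, Cochran's Q, I^2 in percent).

    effects   -- per-study effect sizes (e.g., standardized mean differences)
    variances -- per-study sampling variances of those effect sizes
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Inverse-variance weights: more precise studies count more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I^2: share of total variation attributed to between-study differences,
    # truncated at zero when Q is below its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical effect sizes and sampling variances (illustrative only).
example_effects = [0.10, 0.30, 0.50, 0.70]
example_variances = [0.04, 0.04, 0.04, 0.04]
pooled, q, i2 = heterogeneity(example_effects, example_variances)
print(f"pooled = {pooled:.2f}, Q = {q:.2f}, I^2 = {i2:.1f}%")
```

Values of I² around 25%, 50% and 75% are often cited as rough benchmarks for low, moderate and high heterogeneity, although such cut-offs are conventions rather than strict thresholds.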
Despite the small number of publications in the meta-analysis, the results suggest that EMDR could be a useful psychotherapeutic approach, albeit with small effects, in the treatment of PTSD and associated anxious and depressive symptoms. Overall, treatments with sessions longer than 60 min, more than 8 sessions, and a professional therapist contributed to a significant reduction in symptoms. However, methodologically rigorous randomized clinical trials in the PTSD population would be necessary to conclude clearly that these results have statistical power and external validity. The results show an overestimated effect size in favor of EMDR for studies with poorer methodological quality. The lack of reported data, the method of blinding, the conflicts of interest of some authors in relation to the technique, and the allegiance of the evaluators may have played a fundamental role in this overestimation. Therefore, given the methodological quality of some studies, it is difficult to draw firm conclusions.
It would be interesting for future RCTs to investigate the optimal way to apply EMDR for PTSD. Finally, it is worth mentioning that there is a general tendency in research not to publish non-significant and non-confirmatory results. Although no publication bias was found using the fail-safe N method, we found a certain asymmetry in the meta-analyses of anxious and depressive symptoms at post-treatment, and the trim-and-fill method estimated that published studies were missing in both analyses. It was precisely these meta-analyses of anxious and depressive symptoms at post-treatment that had the highest effect sizes among the calculations performed.
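To make the fail-safe N idea concrete, the following minimal sketch implements Rosenthal's formulation, which estimates how many unpublished null-result studies would be needed to make a combined result non-significant. Rosenthal's variant is an assumption made here for illustration, since the text does not specify which version of the method was applied, and the z-scores below are hypothetical.

```python
from statistics import NormalDist


def rosenthal_fail_safe_n(z_scores, alpha=0.05):
    """Rosenthal's fail-safe N for a set of per-study z-scores.

    Returns the number of additional studies averaging z = 0 that would
    bring the combined (Stouffer) one-tailed p-value up to `alpha`.
    """
    k = len(z_scores)
    if k == 0:
        raise ValueError("at least one study is required")
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    n_fs = (sum(z_scores) ** 2) / (z_crit ** 2) - k
    return max(0.0, n_fs)


def rosenthal_tolerance(k):
    """Rosenthal's rule-of-thumb tolerance level, 5k + 10."""
    return 5 * k + 10


# Hypothetical z-scores for four studies (illustrative only).
example_z = [2.0, 2.5, 1.5, 3.0]
n_fs = rosenthal_fail_safe_n(example_z)
tolerance = rosenthal_tolerance(len(example_z))
print(f"fail-safe N = {n_fs:.1f}, tolerance = {tolerance}")
```

Under Rosenthal's rule of thumb, a fail-safe N above the tolerance level 5k + 10 is taken as a sign that the result is robust to unpublished studies. The method does not test funnel-plot asymmetry, which is why it can disagree with trim-and-fill, as it does in the analyses described above.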
Nor can we determine the superiority of EMDR over other behavior modification therapies or CBT in relation to post-traumatic anxiety and depressive symptoms in maintenance. The long-term effects of EMDR seem controversial and there is not enough evidence to recommend the use of EMDR in people with PTSD.
In conclusion, we wonder whether we are directing research efforts toward the underlying question, i.e., not whether EMDR is effective for PTSD, but how it functions: what is the underlying rationale that makes the technique work? Are we researchers falling into confirmation bias? Are we studying psychological variables as natural rather than interactive variables? This paper does not attempt to resolve the mechanism of action behind EMDR. In line with Cuijpers et al. (2020), assuming that EMDR works only through the behavioral or cognitive-behavioral component of the technique seems reductionist, as it ignores the holistic nature of the eight phases. Some authors, such as Pérez Alvarez (2021), have proposed an alternative explanation from a contextualist-anthropological perspective. This is a problem not just for EMDR but also for CBT, which is categorized by consensus as the gold standard: we do not know its active mechanism of operation either. Are we really using the right methodology to achieve higher degrees of evidence in the treatment of people with psychological problems?