Introduction
Plausibility: A Verbal Cue to Veracity worth Examining?
In 2003, Bella DePaulo and colleagues (DePaulo et al., 2003) published their meta-analysis examining nonverbal and verbal cues to deceit. It included 120 samples examining 158 cues. Fifty of these 158 cues were examined more than five times, including plausibility, which was examined nine times. Plausibility was significantly related to veracity, with truth tellers' stories sounding more plausible than lie tellers' stories. Although the effect size was small (d = 0.23), it was larger than the effect sizes of most other cues. In fact, plausibility emerged as the eighth strongest indicator in the list of 50 cues (DePaulo et al., 2003, Table 8).
Given that plausibility was more strongly related to veracity than most other verbal cues, one would expect researchers to have included plausibility in the set of verbal cues they examine when assessing veracity. This did not happen. To date, at least six frequently cited verbal veracity assessment protocols exist, but in five of them plausibility is not included: Assessment Criteria Indicative of Deception (ACID; Colwell et al., 2013; Colwell et al., 2007); Criteria-based Content Analysis (CBCA; Amado et al., 2016; Köhnken, 2004; Köhnken & Steller, 1988); Cognitive Credibility Assessment (CCA; Vrij, Fisher, et al., 2017; Vrij et al., 2015); the Strategic Use of Evidence (SUE; Granhag & Hartwig, 2015; Hartwig et al., 2014); and the Verifiability Approach (VA; Nahari & Vrij, 2014; Vrij & Nahari, 2019). Plausibility is sometimes included in one tool, Reality Monitoring (RM; Masip et al., 2005; Sporer, 2004; Sporer et al., 2020), under the term 'realism'. In a recent study (Sporer et al., 2020, Study 2), plausibility emerged as the strongest indicator of veracity out of the eight RM cues that were examined.
We do not know why other researchers do not include plausibility in their protocols, but for us subjectivity of coding is the main reason to exclude it. Subjectivity means that we cannot explain to practitioners how to use plausibility as a verbal veracity cue. However, ignoring plausibility could be considered a shortcoming, not only because research has shown that plausibility has potential as a veracity assessment cue but also because in our conversations with practitioners about verbal cues to deception they frequently ask us about this cue. Against this background we decided to start examining plausibility in more detail, resulting in the current project which should be seen as a first step. In the current project we explored to what extent plausibility could be predicted by verbal cues that are coded more objectively and that discriminate truth tellers from lie tellers according to research. If plausibility could be predicted by such cues, we would be one step closer to making the concept of statement plausibility more objective. That is, observers could be instructed to consider these objective cues when judging plausibility.
We checked our datasets and found five in which we examined plausibility and two other verbal cues we thought may be related to it: total details and complications (Deeb, Vrij, Leal, et al., 2020; Leal et al., 2019; Leal et al., 2015; Vrij, Leal, Deeb, et al., 2020; Vrij, Leal, Fisher, et al., 2020). In two of these datasets (Leal et al., 2019; Vrij, Leal, Deeb, et al., 2020) an additional cue was examined which we also thought to have potential in explaining plausibility: verifiable sources.
DePaulo et al. (2003) defined plausibility as "the degree to which the message seems plausible, likely, or believable" (p. 113). Another definition used in the literature that avoids the word 'plausible' in its definition is "how likely it is that the activities happened in the way described" (Leal et al., 2019, p. 278). To judge how likely it is that activities happened in the way they are described, it is useful to take contextual information into account. Contextual information can be present in at least two forms (Blair et al., 2010). First, statements can be compared with independent evidence such as CCTV footage (statements that contradict independent evidence are considered implausible). This is a compelling way to detect deception (Vrij & Fisher, 2016) and the SUE technique is based on this principle (Granhag & Hartwig, 2015). However, this use of plausibility is possible only when independent evidence is available, which is not always the case. Second, statements can be judged in terms of what is conventional or reasonable in a given situation (unconventional or unreasonable activities are considered implausible). Someone can always make this type of comparison and it has shown good potential for lie detection. Blair et al. (2010) conducted a series of experiments in which some participants were given contextual information (e.g., being told that the questions were very difficult to answer prior to deciding whether someone with a high score had cheated in a test), whereas other participants were not given contextual information. Observers who received contextual information performed better in detecting truths and lies (75%) than observers who did not receive such information (57%). However, comparing a statement against the "conventional" or "reasonable" could become subjective if observers disagree on what is conventional or reasonable.
The objective verbal cues we considered were details, complications, and verifiable sources. Details refer to the meaningful units of information in a statement. Truth tellers typically report more details than lie tellers (Amado et al., 2016). Lie tellers lack the cognitive resources to fabricate enough details (Köhnken, 2004) or are unwilling to report many details out of fear that the details they provide will give leads to investigators (Nahari et al., 2014). A complication is an occurrence that affects the story teller and makes a situation more complex ("The air conditioning was not working properly in the hotel, which made the room far too hot"). Truth tellers typically report more complications than lie tellers (Amado et al., 2016). Making up complications requires cognitive resources, but lie tellers may not have adequate cognitive resources to do so (Köhnken, 2004). Besides, adding complications makes the story more complex, which conflicts with lie tellers' inclination to keep their stories simple (Hartwig et al., 2007). The fear that the provided information will give leads to investigators also results in lie tellers reporting fewer verifiable sources than truth tellers (Leal et al., 2019). Verifiable sources refer to sources mentioned in a statement that could be consulted to check the veracity of a statement, such as named witnesses, receipts, and CCTV footage.
All three cues may be related to statement plausibility. Regarding details, people typically underestimate forgetting (Harvey et al., 2019; Koriat et al., 2004; Kornell & Bjork, 2009), which means that they expect others to be able to provide many details when they are asked to provide a detailed account of an activity. Therefore, a detailed account of an activity will be considered plausible and an account that provides few details will be considered unconventional and therefore implausible. Complications frequently occur (Vrij, Mann, et al., 2020; Vrij & Vrij, 2020) and observers will remember similar experiences when other people report them. This will make their stories sound plausible. The absence of complications in an account will be seen as an abnormally smooth report of an activity. People are often able to back up their activities with evidence. That is, they have met a named person, there is footage of the activity (CCTV or photos), the activity is documented (use of phone, bank cards, receipts), etc. Activities will thus be considered more plausible when such verifiable sources are reported. Activities that, according to the interviewee's account, took place in a vacuum of evidence will be considered less plausible.
Method
We re-analysed five datasets (Deeb, Vrij, Leal, et al., 2020; Leal et al., 2019; Leal et al., 2015; Vrij, Leal, Deeb, et al., 2020; Vrij, Leal, Fisher, et al., 2020). In all five datasets, details, complications, and plausibility were examined, and in two datasets (Leal et al., 2019; Vrij, Leal, Deeb, et al., 2020) verifiable sources were also examined. In all five experiments plausibility was defined as "how likely it is that the activities happened in the way described" and was measured subjectively on a 7-point scale ranging from 1 (implausible) to 7 (plausible). Details, complications, and verifiable sources were counted objectively through their frequency of occurrence. We note that, strictly speaking, these measurements are still not objective. Someone has to define those variables, and they then have to be coded according to this definition. However, this coding is more objective than the Likert scale coding used for plausibility. We used the variables as they appeared in the datasets, so no additional coding was carried out for this article. However, some dependent variables were merged for the purpose of this article. See Appendix for more information.
In Deeb, Vrij, Leal, et al. (2020), truth tellers told the truth about a significant event they experienced in the past two years, whereas lie tellers pretended to have experienced a similar event. In Leal et al. (2019) and Vrij, Leal, Fisher, et al. (2020), truth tellers told the truth about a trip they made in the last 12 months, whereas lie tellers pretended to have made such a trip. In Leal et al. (2015), truth tellers discussed a truthful experience of a theft, loss or damage, whereas lie tellers fabricated such experiences. In Vrij, Leal, Deeb, et al. (2020), truth tellers told the truth about a trip they were going to make (intended trip), whereas lie tellers made up such a story. The numbers of participants in the studies were 243 in Deeb, Vrij, Leal, et al. (2020), 83 in Leal et al. (2015), 150 in Leal et al. (2019), 208 in Vrij, Leal, Deeb, et al. (2020), and 201 in Vrij, Leal, Fisher, et al. (2020). In all five experiments, manipulations other than veracity took place and they were included as covariates in the current analyses (see Appendix). In addition, Vrij, Leal, Fisher, et al.'s (2020) experiment was carried out in three different countries and 'country' was also included as a covariate in the analyses.
Results
Plausibility as a Diagnostic Verbal Cue to Veracity
Table 1, final column, shows that plausibility could be measured reliably in all five studies. For each of the five datasets, analyses of covariance were conducted with veracity as the independent variable, manipulations other than veracity as covariates, and details, complications, verifiable sources, and plausibility as dependent variables. Table 1 shows the results for the five experiments. In all five experiments, truth tellers' statements came across as significantly more plausible than lie tellers' statements. The effect sizes for plausibility were large in all experiments and ranged from d = 0.71 to d = 1.18. In all five experiments, truth tellers reported significantly more complications than lie tellers, and the effect sizes ranged from small (d = 0.35) to large (d = 0.88). In four of the five experiments, truth tellers reported significantly more details than lie tellers, and the effect sizes ranged from small (d = 0.33) to medium/large (d = 0.59). Also, truth tellers reported more verifiable sources than lie tellers but this effect was significant in only one study (p = .066 in the other study), and the effect sizes were small (d = 0.31) to large (d = 0.78). Taken together, the findings for plausibility and complications were the most consistent across the five experiments, and plausibility emerged as the most diagnostic cue to predict veracity (largest d-scores).
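For readers less familiar with the d-scores reported above, the following minimal sketch shows how a standardised mean difference between truth tellers and lie tellers can be computed from raw ratings. It assumes the common pooled-standard-deviation form of Cohen's d and does not adjust for covariates, so it is a simplification of the analyses of covariance reported here. The function name and the ratings are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference (group_a minus group_b), pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up plausibility ratings on the 1-7 scale; NOT data from the five studies.
truth_tellers = [5, 6, 4, 7, 5, 6]
lie_tellers = [4, 3, 5, 4, 2, 4]

d = cohens_d(truth_tellers, lie_tellers)
print(f"d = {d:.2f}")
```

A positive d in this sketch means that the first group (here, truth tellers) scored higher on average than the second group.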
Table 1 shows that in all five studies, moderate correlations emerged between plausibility and the remaining variables and all in the expected direction: increased plausibility was correlated with increased numbers of details, complications, and verifiable sources.
Note. ¹Two-tailed correlational tests were carried out between plausibility and the other variables (details, complications, verifiable sources) as the hypotheses were exploratory. We report Pearson's correlation coefficient r and the corresponding confidence intervals for each verbal cue.
²In Leal et al. (2015), interrater reliability of plausibility was measured through five judges, and the statistic represents Cronbach's alpha. In the other studies, interrater reliability for plausibility and all other variables was measured through two judges, and the statistic represents the intraclass correlation coefficient (ICC) using the two-way random effects model measuring consistency.
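As an illustration of the correlational statistics described in the note, the sketch below computes Pearson's r and an approximate confidence interval. It assumes the Fisher z transformation as the method for the interval, which is a common choice but is not necessarily the one used for Table 1. The function names and the data are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Approximate two-sided confidence interval for r via Fisher's z."""
    z = atanh(r)                                   # Fisher z transform of r
    se = 1 / sqrt(n - 3)                           # standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. about 1.96 for 95%
    return tanh(z - crit * se), tanh(z + crit * se)


# Made-up values for eight statements; NOT data from the five studies.
plausibility = [2, 3, 3, 4, 5, 5, 6, 7]
complications = [0, 1, 0, 2, 2, 3, 3, 5]

r = pearson_r(plausibility, complications)
lower, upper = fisher_ci(r, len(plausibility))
print(f"r = {r:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With samples as small as this illustrative one the interval is wide; the intervals narrow as the number of statements grows.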
Verbal Cues that Predict Plausibility
Table 2 shows the results from the linear regression analyses for all five experiments. A forced entry method was used with details, complications, and verifiable sources as predictors and plausibility as the outcome variable. When verifiable sources was not included as a predictor in the analyses, complications and details explained 25% to 48% of the variance. Complications contributed more than details to the model in two experiments (Leal et al., 2019; Leal et al., 2015). In Vrij, Leal, Deeb, et al. (2020), details (β = .33, p < .001) contributed more than complications (β = .30, p < .001) to the model, but this difference was negligible. In the remaining two studies (Deeb, Vrij, Leal, et al., 2020; Vrij, Leal, Fisher, et al., 2020), only complications contributed to the model.
When verifiable sources was included as a predictor in the regression, 41% to 60% of the variance was explained. Verifiable sources contributed to explaining the model's variance, either more than details or more than both details and complications. These results demonstrate that complications and verifiable sources better predict plausibility than details.
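To make the regression quantities concrete, the sketch below computes standardised coefficients (β) and R² for the simplest case considered here: two predictors entered together. It assumes the standard closed-form solution for two-predictor ordinary least squares expressed in terms of pairwise correlations, and it ignores any covariates. The function names and the data are hypothetical, and the sketch is not the procedure used to produce Table 2.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def standardised_regression(y, x1, x2):
    """Standardised betas and R^2 for y regressed on x1 and x2 together."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    denom = 1 - r_12 ** 2  # zero when the two predictors are perfectly collinear
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# Made-up values for eight statements; NOT data from the five studies.
plausibility = [2, 3, 3, 4, 5, 5, 6, 7]
details = [20, 35, 25, 30, 45, 38, 50, 48]
complications = [0, 1, 0, 2, 2, 3, 3, 5]

b_details, b_complications, r2 = standardised_regression(
    plausibility, details, complications
)
print(f"beta(details) = {b_details:.2f}, "
      f"beta(complications) = {b_complications:.2f}, R^2 = {r2:.2f}")
```

Because the coefficients are standardised, their relative sizes can be compared directly, which is the sense in which one cue is said to contribute more than another to the model.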
Discussion
Plausibility was positively correlated with details, complications, and verifiable sources but was mostly predicted by complications and verifiable sources. These cues explained 37.29% of the variance (average of the seven R² values reported in Table 2), which means that we succeeded to some extent in making the concept of plausibility more objective. However, it also means that the remaining 62.71% should be explained by other cues. We believe that contextual information about what is the convention in a given situation may account for at least some of the unexplained variance, as research has shown (Blair et al., 2010; Masip & Herrero, 2015). To take an example from one of our own datasets, a businessman travelling from Tokyo to Barcelona pretended to go to Barcelona for a weekend break. He gave a detailed account of which attractions he was going to visit in Barcelona and where he would stay; he provided complications on the planning phase of the trip, and he could present his hotel reservation as evidence. Despite this, his story did not seem plausible because he said he would stay in Barcelona for less than 48 hours. A return trip Tokyo-Barcelona for less than 48 hours just for sightseeing sounds implausible. In a similar vein is the story of the two Russian men suspected of poisoning a former Russian military officer and double agent for the UK intelligence services in Salisbury in England (Roth & Dodd, 2018). They said they travelled from Moscow to the UK for a 43-hour trip to visit Salisbury Cathedral. That is an odd purpose for a trip from Moscow to the UK, even more so because they stayed in a London hotel. Why not stay in a Salisbury hotel if that was their final destination? Their story would not have seemed plausible even if they had given many details, complications, and verifiable sources in their interview.
Using total details as a possible predictor for plausibility may be another reason as to why a substantial amount of variance remained unexplained. Total details is a rough measure that gives all details equal weight. In reality some details may be more important to explain plausibility than others. This would resemble the Model Statement findings. A Model Statement is an example of a detailed account unrelated to the topic of investigation (Leal et al., 2015). It raises expectations amongst both truth tellers and lie tellers to provide more information (Ewens et al., 2016). As a result, total details does not discriminate truth tellers from lie tellers after exposure to a Model Statement (Vrij et al., 2018). However, rather than the quantity of details it is the quality of details that discriminates truth tellers from lie tellers after being exposed to a Model Statement. For example, differences between truth tellers and lie tellers arise in reporting core or peripheral details (Leal et al., 2018) and in reporting complications (Deeb, Vrij, & Leal, 2020; Vrij, Leal et al., 2017). This quantity versus quality of detail argument may also influence plausibility ratings. The distinction between core and peripheral details may also be relevant for plausibility ratings, and perhaps statements that focus on core information are considered to be more plausible. In addition, the verbal deception literature contains a wealth of details (other than complications) that discriminate truth tellers from lie tellers (Amado et al., 2016). Researchers could start examining such details.
Statement plausibility was a diagnostic cue to veracity in all five experiments, and it showed larger effect sizes than the other three objectively assessed verbal cues: details, complications, and verifiable sources. A relatively strong performance from plausibility in discriminating between truth tellers and lie tellers was also found in Sporer et al. (2020) and in DePaulo et al.'s (2003) meta-analysis. That plausibility is predicted by multiple cues (complications, verifiable sources, and probably also contextual information) may explain why it showed the largest effect sizes. Assessing statements based on a combination of diagnostic cues (complications and verifiable sources in the current research) is more likely to enhance lie detection than assessments based on individual cues (DePaulo et al., 2003; DePaulo & Morris, 2004; Hartwig & Bond, 2014). Of course, we cannot rule out that even more cues, not examined in the current five experiments, contribute to plausibility ratings. It probably is a verbal cue that consists of more components than many other verbal cues.
The strong performance of plausibility in distinguishing truth tellers from lie tellers makes it an attractive verbal cue. In addition, given how difficult and time-consuming it is to count objective cues such as details, complications, and verifiable sources, rating statement plausibility on a Likert scale may save time as well as cognitive resources. This is crucial for investigative practitioners who are frequently under pressure to resolve cases rapidly (Horgan, 2014). The question arises whether the subjective nature of plausibility is a price worth paying. We think that at present using plausibility as a veracity assessment cue is premature and advocate against its use. However, we think that plausibility deserves more attention from researchers than it currently attracts.
In terms of research, first, we encourage deception researchers to start including plausibility as a cue in their research to further test its diagnostic value but also to examine which objective cues can explain plausibility. The latter results could lead to a more objective way to measure statement plausibility. Second, in the five experiments discussed in this article, plausibility was always defined as "how likely it is that the activities happened in the way described". Research could examine whether providing observers with different definitions of plausibility would lead to different results. For example, would the definition "how likely it is that the category of activity that is described in this statement generally happens in the way described" lead to different results? Based on the current findings the definition "how likely the overall statement includes complications and verifiable sources given the context" is worth examining.
Third, the five experiments in this paper used samples of college students or community members. It may be useful to examine plausibility among forensic suspects. Suspects and inmates typically do not provide detailed statements and prefer to keep their stories simple, but at the same time, they strive to sound plausible (Alison et al., 2014; Strömwall & Willén, 2011). Suspects also have more insight into people's beliefs about deception and may use countermeasures effectively to mimic truth tellers' responses, but they are not necessarily successful in all their attempts (Deeb et al., 2018; Granhag et al., 2004; Rosenfeld, 2018; Vrij, Leal, Fisher, et al., 2020). For example, asking lie tellers to provide verifiable details does not make lie tellers more forthcoming with respect to verifiable information as that may incriminate them (Nahari et al., 2014). Future research could examine how successful real suspects are when instructed (or not) to provide plausible statements, which we have shown to be partially based on verifiable information.
Fourth, future research could examine true statements that attract low plausibility ratings and false statements that attract high plausibility ratings. Is there something beyond different types of detail that triggers those incorrect plausibility ratings? For example, are rare events seen as implausible regardless of their veracity? And are statements that are considered to be against someone's self-interest seen as implausible regardless of their veracity?
We think there is a large set of questions to be examined in relation to plausibility and that it is worthwhile to pursue them given that plausibility seems to be a relatively strong veracity indicator and practitioners frequently ask questions about it. We hope that this article will start research and discussions about the relationship between plausibility and veracity.