Chapter 8 Discussion and perspectives

Several lines of research have suggested that inner speech may recruit speech motor processes. This work includes introspective and phenomenological studies, mental chronometry studies, motor interference studies, modelling work as well as neurophysiological and psychophysiological studies (cf. our short historical review in Chapter 1). However, the involvement of motor processes during inner speech is highly variable between individuals, tasks, and studies. Therefore, it seems reasonable to assume that inner speech comes in different varieties that may involve speech motor processes to a variable extent. We tested this assumption by examining the involvement of the speech motor system during induced rumination, a negative and repetitive form of inner speech. In addition to shedding light upon the nature of inner speech, this work may offer new theoretical and experimental tools to assess the presence and persistence of ruminative thoughts.

8.1 Summary of the results

As argued in Chapter 1, the guiding assumption underlying this work was that (verbal) rumination may be considered as a form of inner speech, defined as the silent production of speech in one’s mind. Honouring that assumption, we studied induced rumination with the tools and methods used to investigate the phenomenon of inner speech. In the first experimental chapter (Chapter 3), we used surface EMG to assess the predictions of two competing views of inner speech production. According to the motor simulation view, inner speech would be similar to overt speech (i.e., it would include full phonological specification and articulatory planning), except that final execution of the speech actions is inhibited. Therefore, if articulatory motor planning is part of inner speech, and if partially inhibited motor commands are sent to the speech apparatus, it should be possible to record peripheral muscular activation in the speech muscles during inner speech. According to the abstraction view, the level of truncation between overt speech and inner speech would be located higher in the production process, in the sense that inner speech would not include articulatory features. Therefore, under this view, no peripheral muscular activation is expected in the speech muscles during inner speech. We observed that the induction of rumination was accompanied by increased EMG amplitude in all facial muscles (i.e., orbicularis oris inferior, orbicularis oris superior and frontalis) as compared to rest. It should be noted that baseline recordings were performed after a relaxation session. We interpreted these findings as a corroboration of the motor simulation view, with the activity of the speech muscles increasing from baseline to after the induction, suggesting that rumination, as a form of inner speech, may involve the speech motor system. 
We interpreted the increased activation of the forehead muscle as a consequence of the negative content of rumination, as the frontalis muscle is known to be associated with the expression of anger and sadness. Additionally, in the second part of this experiment, we observed that a relaxation focused on the speech muscles was slightly more efficient than a relaxation focused on the arm in reducing self-reported state rumination. However, a few important limitations are worth keeping in mind when considering these results. First, as we did not have any control group for the rumination induction, it is problematic to attribute the observed effects to the rumination induction only. Second, and related to our first point, the dissociation between the activation of the lip muscles and the forehead muscle may not be that straightforward. Indeed, it might be that the activation of both sites was related to rumination as a form of inner speech, or that the activity of all facial muscles was related to negative affect only. This point echoes and strengthens the recommendations made by Garrity (1977) and discussed in Box .

To tackle these limitations, in Chapter 4, we sought to examine the differences between different forms of induced rumination, which should theoretically involve the speech motor system to a different extent. We compared the orofacial EMG correlates of verbal rumination and non-verbal (visual) rumination. Unfortunately, self-reports of the modal content of the ruminative thoughts showed that our induction did not succeed in inducing rumination in different modalities. However, even when exploring the (a posteriori) relation between the modality of the ruminative thoughts and the facial EMG correlates, we failed to find the predicted relation. Put simply, verbal rumination was not associated with more activity in the speech muscles than visual-dominant rumination. Moreover, comparing two types of relaxation (as in Chapter 3) revealed that, in contrast to previous results, the arm relaxation was slightly more efficient than the orofacial relaxation in reducing state rumination. Averaging the relaxation results from these first two studies revealed that both relaxation types have a similar effect on state rumination. These results therefore suggest that verbal rumination is not specifically accompanied by peripheral muscular activity in the speech muscles, compared with visual-dominant rumination. However, it is unclear whether this result is due to poor sensitivity of the surface EMG measurements or to the fact that rumination is a form of inner speech that does not involve the speech motor system.

In Chapter 5, we sought to resolve that ambiguity by examining how reliably our EMG measurements detect peripheral muscular activity during inner speech production. To this end, we asked participants to produce two lists of nonwords, which were designed to induce either a strong activation of the lip muscles or a strong activation of the zygomaticus major muscle. We recorded the EMG amplitude of several facial muscles (including the orbicularis oris inferior and the zygomaticus major) during the production of these nonwords in inner speech and overt speech, and while participants listened to these nonwords. First, contrary to expectations, even in the overt mode, nonwords containing spread lip phonemes (e.g., /i/) did not result in more EMG activity in the zygomaticus major region than nonwords without spread phonemes. Similarly, nonwords with lip protrusion did not result in more EMG activity in the orbicularis oris inferior region than non-rounded nonwords. This finding suggests that surface EMG is not precise enough to capture the activity of specific muscles directly. Based on previous results in the literature, we hypothesised that surface EMG may be used to discriminate the content (here, the class of nonword) produced in inner speech. However, an automatic classification revealed that although we were able to discriminate content produced in overt speech, we were not able to discriminate the content produced in inner speech based on surface EMG measurements. This result stands in contrast with previous historical results but also with more recent results obtained by other teams. However, crucial differences between other studies and ours include differences in the material used (e.g., surface vs. intramuscular recordings), the population (e.g., children vs. adults), or the general methodology (e.g., hypothesis testing vs. classification and optimisation). 
Despite this surprising result and the failure of the surface EMG methodology to “decode” the content of inner speech, the abundance of positive results in the literature still speaks in favour of peripheral muscular components of inner speech and of the possibility of assessing them using surface EMG. In order to avoid the potential limitations of EMG recordings, we shifted in the second part of the present work to another strategy for examining the role of motor processes in rumination. More precisely, instead of recording peripheral muscular activation during induced rumination, we tried to directly interfere with the speech motor system to check whether this would affect verbal rumination.

In Chapter 6, we set up a critical test of the motor simulation view of inner speech (and rumination). Indeed, if the involvement of the speech motor system is necessary during inner speech and rumination, then a disruption of the speech motor system should disrupt (or impair) the production of inner speech (and rumination). To examine this idea, we compared the effects of an articulatory suppression task to a finger-tapping task, following a rumination induction, on the levels of self-reported state rumination. Our results suggest that self-reported state rumination decreased after both motor activities, with only a slightly stronger decrease following articulatory suppression, suggesting that rumination would not be a form of inner speech that crucially depends on the activity of the speech motor system (it does not mean that the speech motor system may never be involved in rumination, only that its involvement may not be necessary). However, some important limitations make the interpretation of these results delicate. First, there were important differences between the two groups (i.e., articulatory suppression vs. finger-tapping groups) at baseline, possibly due to the rhythmic training proposed before baseline measurements. Second, the measure of state rumination consisted of a single non-validated scale (already used in Chapter 3) and may not be a reliable index of state rumination. Third, as in Chapter 3, there was no control group for the rumination induction and any effect following the rumination induction may not be specifically attributable to it. Fourth, there is some evidence suggesting that finger-tapping (with the dominant hand) may also perturb speech motor planning and may therefore not be the best control condition for articulatory suppression. We sought to overcome these limitations in the last empirical chapter.

Finally, in Chapter 7, we extended the experiment from Chapter 6 by comparing the effects of articulatory suppression (vs. finger-tapping) on rumination and problem-solving, another (more adaptive) form of repetitive thinking. To overcome the limitations of the previous experiment, we used a validated scale of state rumination and asked our participants to use their non-dominant forearm in the finger-tapping condition. We also made sure that our baseline measurements were not contaminated by any systematic effect. The data collection for this experiment is still ongoing and given the very low sample size (around 10 participants per group at the time of writing), we will not consider these results further in the present discussion. However, preliminary analyses presented in the results section of Chapter 7 suggest that articulatory suppression may indeed interfere with induced rumination.

8.2 Theoretical implications of the results

8.2.1 Epistemological interlude

In order to fully apprehend the theoretical implications of these results, it might be useful to first clearly articulate the logical argument elaborated throughout the present work. In the first part (the EMG studies presented in Chapters 3 and 4), the logical argument was as follows: if verbal rumination is a form of inner speech, then rumination should be accompanied by peripheral muscular activity in the speech muscles. However, going from the substantive hypothesis (verbal rumination is a form of inner speech) to the experimental prediction (i.e., connecting theory to observations) actually requires the use of auxiliary hypotheses or assumptions. Elucidating these auxiliary assumptions, the actual logical argument from the first part can be restated as follows:

  • Theoretical assumption (\(T\)): Verbal rumination is a form of inner speech

    • Auxiliary hypothesis 1 (\(A_{1}\)): Some forms of inner speech involve the motor simulation of speech production
    • Auxiliary hypothesis 2 (\(A_{2}\)): The simulation mechanism recruits neural networks engaged in (overt) execution
    • Auxiliary hypothesis 3 (\(A_{3}\)): The motor commands generated during simulation are only partially inhibited
    • Instrumental hypothesis 1 (\(I_{1}\)): Surface electromyography is a reliable tool to peripherally record partially inhibited motor commands
    • Ceteris paribus clause (\(C_{p}\)): We assume there is no other factor exerting an appreciable influence that could obfuscate the main effect of interest
  • Prediction: Induced rumination should be accompanied by peripheral muscular activity (EMG traces) in the speech muscles

In other words, we say that if the ensemble of premises \(p\) (i.e., the conjunction of the theoretical assumption, auxiliary hypotheses, etc.) is true, it should follow that \(q\) is true. Therefore, stating \(p\) suffices to conclude \(q\) (modus ponens), that is, \(p\) entails \(q\). To be even more precise, when we test a theory predicting that if \(O_{1}\) (some experimental manipulation or predictor variable), then \(O_{2}\) (some observation or measured variable), what we actually say is that this relation holds if and only if all the conjuncts above are true. Thus, the logical structure of an empirical test of a theory can be described as the following conceptual formula (e.g., Meehl, 1990, 1997):

\[ (T \land A_{1} \land A_2 \land A_{3} \land I_{1} \land C_{p} \land C_{n}) \to (O_{1} \supset O_{2}) \]

where the “\(\land\)” are conjunctions (“and”), the arrow “\(\to\)” denotes deduction (“follows that …”), and the horseshoe “\(\supset\)” is the material conditional (“If \(O_{1}\), Then \(O_{2}\)”). \(A_{1}\), \(A_{2}\) and \(A_{3}\) form a conjunction of auxiliary theories, \(C_{p}\) is a ceteris paribus clause (i.e., we assume there is no other factor exerting an appreciable influence that could obfuscate the main effect of interest), \(I_{1}\) is an auxiliary theory regarding instruments, and \(C_{n}\) is a statement about experimentally realised conditions (i.e., we assume that there is no systematic error/noise in the experimental settings).

In other words, we imply that a conjunction of all the elements on the left side (including our substantive theory \(T\)) does imply the right side of the arrow, that is, “if \(O_{1}\), then \(O_{2}\)”. From there, observing \(q\) (where \(q\) represents the right side of the above formula) does not allow inferring \(p\) (affirming the consequent fallacy) but not observing \(q\) (\(\lnot q\)) allows inferring not \(p\) (\(\lnot p\)) via the modus tollens. However, not observing \(q\) does not permit refuting the substantive hypothesis \(T\) alone. Rather, not observing \(q\) only allows for the refutation of \(p\), the conjunction of all elements described above (i.e., \(T \land A_{1} \land A_2 \land A_{3} \land I_{1} \land C_{p} \land C_{n}\)). Put formally, negating the conjunction is logically equivalent to stating a disjunction of the negated conjuncts (i.e., at least one of the conjuncts is false; Meehl, 1990). Therefore, not observing \(q\) only counts as a refutation of \(T\) to an extent that is a function of the (im)plausibility of the other conjuncts in \(p\) (i.e., \(A_{1}\), \(A_{2}\), \(A_{3}\), \(I_{1}\), \(C_{p}\) and \(C_{n}\)). To sum up, failing to observe a predicted outcome does not necessarily mean that the theory itself is wrong, but rather that the conjunction of the theory and the underlying assumptions at hand is invalid (Lakatos, 1976; Meehl, 1990, 1997).
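The two inferential steps described above can be written out explicitly. This is only a restatement in the notation of the conceptual formula, where \(p\) abbreviates the conjunction of premises and \(q\) abbreviates the conditional \(O_{1} \supset O_{2}\):

\[ (p \to q) \land \lnot q \;\Rightarrow\; \lnot p \]

\[ \lnot p \;\equiv\; \lnot (T \land A_{1} \land A_{2} \land A_{3} \land I_{1} \land C_{p} \land C_{n}) \;\equiv\; \lnot T \lor \lnot A_{1} \lor \lnot A_{2} \lor \lnot A_{3} \lor \lnot I_{1} \lor \lnot C_{p} \lor \lnot C_{n} \]

The first line is the modus tollens and the second is De Morgan's law: a failed prediction establishes only that at least one conjunct is false, without indicating which one.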

Similarly, the logical argument from the second part (i.e., the relaxation experiments presented in Chapters 3 and 4 as well as the articulatory suppression studies presented in Chapters 6 and 7) was of the following form: if verbal rumination is a form of inner speech, then a disruption of the speech motor system should disrupt rumination. Again, this argument may be restated in a more detailed form as follows:

  • Theoretical assumption (\(T\)): Verbal rumination is a form of inner speech

    • Auxiliary hypothesis 1 (\(A_{1}\)): Some forms of inner speech involve the motor simulation of speech production
    • Auxiliary hypothesis 2 (\(A_{2}\)): The simulation mechanism recruits neural networks engaged in (overt) execution
    • Ceteris paribus clause (\(C_{p}\)): We assume there is no other factor exerting an appreciable influence that could obfuscate the main effect of interest
  • Prediction: A disruption of the speech motor system should disrupt rumination
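
Using the same notation as for the first argument, and only the premises listed above, this second argument can be sketched as the following conceptual formula (here \(O_{1}\) stands for the disruption of the speech motor system and \(O_{2}\) for the disruption of rumination; this labelling is our shorthand for the present illustration):

\[ (T \land A_{1} \land A_{2} \land C_{p}) \to (O_{1} \supset O_{2}) \]

Compared with the first argument, \(A_{3}\) and \(I_{1}\) do not appear, as the list above does not include them.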

In other words, using the same reasoning as before, we say that not observing \(q\) only counts as a refutation of \(T\) to an extent that is a function of the (im)plausibility of the other conjuncts in \(p\). The question remains as to how we could assess the plausibility of each conjunct in order to examine the validity of the substantive hypothesis. Interestingly, Strevens (2001) discusses a Bayesian solution to this problem (known as the Duhem-Quine problem in philosophy of science). Reformulating the problem as one of assigning “credit or blame to central hypotheses vs. auxiliary hypotheses” (Gershman, 2019), Strevens suggests a Bayesian framework for confirmation. Let \(h\) denote the substantive hypothesis, \(a\) the auxiliary hypothesis (the reasoning can be generalised to multiple auxiliary hypotheses), and \(d\) the data. After observing the data \(d\), the prior probability of the conjunction \(ha\) (i.e., \(P(ha)\)) is updated to the posterior probability \(P(ha|d)\) according to Bayes’ rule:

\[ P(h a | d) = \frac{P(d | h a) P(h a)}{P(d | h a) P(h a)+P(d | \neg(h a)) P(\neg(h a))}, \]

where \(P(d|ha)\) is the likelihood of the data under \(ha\), and \(\lnot (ha)\) denotes the negation of \(ha\). From there, marginalising over all possible auxiliary hypotheses, the sum rule of probability allows us to obtain the updated belief about the substantive hypothesis:

\[ P(h | d) = P(h a | d) + P(h \neg a | d). \]

Similarly, the marginal posterior over the auxiliary is given by:

\[ P(a | d) = P(h a | d) + P(\neg h a | d). \]

To sum up, although failing to observe an outcome predicted by a substantive theory cannot count as a strict falsification of that theory, a Bayesian confirmationist framework makes it possible to assess the plausibility of each conjunct separately and to guide the rational updating of knowledge in the light of incoming data (for more details, see Gershman, 2019; Strevens, 2001). In the next section, we revisit our results, keeping these concepts in mind, in order to assess the plausibility of each conjunct and the evolution of these plausibilities throughout the data we accumulated in our work.
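
The updating scheme above can be illustrated with a minimal numerical sketch. The prior and likelihood values below are arbitrary illustrative assumptions (they are not estimates from our data), and the function names are hypothetical. The sketch further assumes that \(h\) and \(a\) are independent a priori, which the framework itself does not require. Here \(d\) stands for the failure to observe a predicted outcome, which is taken to be unlikely when both \(h\) and \(a\) are true.

```python
from itertools import product


def posterior_over_cells(prior_h, prior_a, likelihood):
    """Return P(h, a | d) for the four truth-value combinations of (h, a).

    likelihood[(h, a)] is P(d | h, a). Independence of h and a in the
    prior is an assumption made for this illustration only.
    """
    unnormalised = {}
    for h, a in product([True, False], repeat=2):
        p_h = prior_h if h else 1.0 - prior_h
        p_a = prior_a if a else 1.0 - prior_a
        unnormalised[(h, a)] = likelihood[(h, a)] * p_h * p_a
    evidence = sum(unnormalised.values())  # P(d), denominator of Bayes' rule
    return {cell: value / evidence for cell, value in unnormalised.items()}


def marginals(post):
    """Apply the sum rule: P(h | d) and P(a | d) from the joint posterior."""
    p_h = post[(True, True)] + post[(True, False)]
    p_a = post[(True, True)] + post[(False, True)]
    return p_h, p_a


# Illustrative (made-up) priors for the substantive and auxiliary hypotheses.
prior_h, prior_a = 0.6, 0.7

# d = "the predicted outcome was not observed": unlikely if h and a both hold,
# and assumed equally likely (0.5) in every other case.
likelihood = {
    (True, True): 0.05,
    (True, False): 0.5,
    (False, True): 0.5,
    (False, False): 0.5,
}

post = posterior_over_cells(prior_h, prior_a, likelihood)
p_h, p_a = marginals(post)
print(f"P(h | d) = {p_h:.3f} (prior {prior_h})")
print(f"P(a | d) = {p_a:.3f} (prior {prior_a})")
```

With these illustrative numbers, both marginal posteriors fall below their priors: the blame for a failed prediction is shared between the substantive and the auxiliary hypothesis rather than assigned to \(h\) alone.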

8.2.2 Re-reading our results

What is the role of speech motor processes in rumination? This question can be reframed as follows: what is the role of speech motor processes in inner speech production and how does this role vary across the different varieties of inner speech? Are some forms of inner speech always/never motoric? As suggested in the previous discussion, assessing such theoretical issues requires considering and weighing the plausibility of auxiliary assumptions used to connect theoretical statements to empirical predictions. One of the major assumptions we have made in the first part of the present work (the EMG studies) was that surface EMG was a reliable way of examining inner speech production. What is the plausibility of that assumption?

As reviewed in Chapter 1, many studies have shown that it is possible to use (both surface and intramuscular) electromyography to “decode” the content of inner speech, although some studies failed to do so. However, results from EMG studies of inner speech vary in persuasiveness depending on the strictness of their experimental protocol. As discussed previously, the most convincing studies are the ones showing muscle-specific EMG correlates of inner speech production (e.g., McGuigan & Dollins, 1989; McGuigan & Winstead, 1974). Most of these studies involve multiple recordings per participant and per stimulus, providing a high sensitivity to these EMG measurements. In contrast, the least convincing studies include the ones showing a general (i.e., non-specific) increase in facial muscular activity from rest to the condition of interest, as this increase may be due to many factors other than inner speech production per se (cf. our discussion in the last section but also in Chapter 1 and Chapter 5). The latter type of study usually focuses on more ecological occurrences of inner speech, such as the production of fully formed sentences, poem recitation, or the occurrence of maladaptive forms of inner speech (e.g., AVHs or rumination). One crucial difference between EMG studies of lower-level forms of inner speech (like the one we carried out in Chapter 5) and EMG studies of more naturalistic forms of inner speech (like the one we carried out in Chapter 3) is the ability to repeat measurements. Indeed, whereas it is relatively easy to obtain several repetitions of multiple vowels or syllables for a given pool of participants, it is experimentally more arduous to repeat the measurement of more complex forms of inner speech such as AVHs or rumination. 
As a consequence, and given that the sensitivity of our EMG measures was already insufficient to decode inner speech in the experiment reported in Chapter 5, it might be that the sensitivity of surface EMG was also too low to detect the presence of rumination as a form of inner speech (as observed in Chapter 3 and Chapter 4). Importantly, this claim concerns the detection of rumination as a form of inner speech specifically. It does not mean that surface EMG cannot be used to assess the presence of rumination (e.g., focusing on the activity of the frontalis or any other facial muscle), only that the changes in EMG amplitude cannot be attributed to speech motor processes per se (because they seem not to be muscle-specific and not to be specific to the verbal content of rumination). To sum up, although surface EMG measurements may be used to assess the content of inner speech production (in sufficiently well powered experimental designs), they might not be suitable for assessing the presence of more naturalistic and uniquely (i.e., on a single occasion) occurring forms of inner speech. To put it in other words, although our instrumental assumption \(I_{1}\) about the reliability of EMG measurements may be valid in well-powered designs, it may not be valid in EMG studies of rumination (which may impede the ability to test our substantive hypotheses regarding the role of motor processes in inner speech and rumination). We shall now examine the results from the second part of the thesis, in which we directly tried to interfere with the speech motor system during rumination.

Results from this second part include the results from the relaxation experiments reported in Chapters 3 and 4 as well as the articulatory suppression study presented in Chapter 6 (as discussed previously, we will not consider the preliminary results from Chapter 7). As discussed in section 8.1, the combined results from the relaxation experiments presented in Chapters 3 and 4 suggest that a relaxation focused on the orofacial area was not more efficient than a relaxation focused on the non-orofacial (brachial) area. This observation is interesting in many ways. First, it highlights the need for replication. Indeed, based on the results from Chapter 3 only, we would have concluded that state rumination could be reduced via relaxation targeting the speech muscles. Conversely, based on the results from Chapter 4 only, we would have concluded that state rumination could be reduced by relaxation focused on the arm. It is only the combined consideration of both results that allowed us to observe this null effect. Second, it shows that although relaxation may decrease state rumination, the effect of relaxation is not specific to speech motor processes (both types of relaxation were equally effective, on average, in reducing state rumination). These results suggest that the activity of the speech motor system is not necessary for experiencing rumination (it does not mean that the activity of the speech motor system does not play a role at all), as state rumination was not differentially affected by manipulation of the facial vs. body motor system. This observation is consistent with previous results, showing that a “passive” peripheral disruption of the speech motor system (e.g., via anaesthesia) does not disrupt inner speech. In complement to these experiments, we used articulatory suppression in Chapter 6 to directly interfere with the speech motor system during rumination. 
This operationalisation may differ from the relaxation experiment in the sense that it requires the participant to actively plan speech motor actions. Therefore, we consider this as an “active” peripheral disruption of the speech motor system. Results from this study are difficult to interpret, for the reasons already mentioned in Chapter 6 and section 8.1. Nevertheless, they suggest that articulatory suppression was only slightly more efficient in reducing state rumination than finger-tapping. This result again corroborates the idea that the activity of the speech motor system is not necessary for experiencing rumination (although this conclusion should be further confirmed or contradicted by the data collected in the experiment from Chapter 7).

Overall, these results suggest that rumination does not necessitate the activity of the speech motor system (as it seemed not to be associated with specific activity in the speech muscles and as it seemed not to be more strongly affected by articulatory suppression than manual suppression). In the next sections, we discuss the implications of these results for theories of inner speech and rumination and suggest ways forward from an experimental perspective.

8.2.3 Implication of these results for inner speech theories

How could it be that some forms of inner speech involve the speech motor system to an extent that is quantifiable using surface EMG whereas some other forms (e.g., rumination) do not? This question can be approached from different perspectives and at different levels of explanation. As discussed in Chapter 1, Vygotsky’s model of inner speech development and Fernyhough’s (2004) extended four-level model suggest that inner speech may be expressed with different degrees of “externalisation”, from condensed inner speech to expanded inner speech. These forms of inner speech are situated on a continuum and it seems legitimate to assume that more expanded forms of inner speech recruit the speech motor system to a greater extent than more condensed forms of inner speech. This idea is supported by many studies showing a progressive externalisation of inner speech under cognitively demanding situations (e.g., Sokolov, 1972). However, these models do not stipulate how the involvement of the speech motor system is regulated. What mechanism(s) may explain the differences in the degree of involvement of the speech motor system during inner speech production?

As discussed previously, Sokolov (1972) also observed that the “externalisation” of inner speech was a function of the novelty of the task and of the degree of automaticity. How could the difficulty, novelty, and automaticity of the task influence the externalisation of inner speech? We already outlined some possible answers to this question in Chapter 1. According to Cohen (1986), the presence of motor activity during inner speech may be interpreted in terms of attentional sharing. For instance, cognitively demanding situations (e.g., novel or difficult tasks) arguably require a greater amount of attention to be performed. In these situations, the vividness of inner speech percepts could be strengthened by increasing the speech motor activity, resulting in more salient auditory percepts. Alternatively, the greater externalisation of inner speech in cognitively demanding tasks may be restated in the motor control framework by postulating that a lower amount of inhibition is applied to block motor commands during inner speech, resulting in higher levels of motor activity (and also arguably more vivid inner speech percepts). These two explanations are not incompatible and, as discussed previously, the modulation of the amount of inhibition provides a mechanism through which inner speech percepts are reinforced (or not).

How does this fit with our results and with rumination more specifically? As we will argue, rumination can be considered as a mental habit, that is, a mental process that became automatic by repetition (cf. our more detailed discussion in the next section). As discussed in section 1.2.3, the peripheral muscular activation often observed during motor imagery and inner speech may be attributed to (the consequences of) partially inhibited motor commands (i.e., to residual movements). Therefore, variations in the amount of peripheral muscular activity recorded during inner speech production may be attributed to variation in the amount of inhibition applied to motor commands issued during inner speech production. Thus, rumination may be considered as a strongly internalised form of inner speech that does not recruit the speech motor system (in other words, the motor commands that are emitted during this form of inner speech are greatly inhibited). However, it is still unclear what exactly these inhibitory mechanisms are (e.g., what kind of inhibition is at play, cf. MacLeod, 2007; how and when these mechanisms are implemented, cf. Guillot et al., 2012a) and elucidating the inhibitory mechanisms underlying inner speech production will be the focus of one of our future research projects.

Another possibility is that more automatic forms of inner speech may rely more on associative memory-based processes whereas less automatic (more intentional or deliberate) forms of inner speech may rely more on simulation (or emulation) mechanisms (cf. the model proposed in Lœvenbruck et al., 2018). Why would that be the case? Automatic forms of inner speech (e.g., poem recitation), but also more general forms of motor habits, are developed through the repeated learning of the association between motor commands and the sensory consequences of these motor commands. Through repeated learning, these actions become automatic. By automatic, we mean that these actions i) can be executed without awareness of the action being executed, ii) can be initiated without awareness or deliberate attention, iii) can be evoked automatically by stimuli in the environment, without deliberately orienting the attention to it, and iv) can be performed without interfering with other tasks (Norman & Shallice, 1986). In contrast, we may speculate that novel (unusual) motor actions, to be imagined, need to go through the simulation/emulation mechanism. This idea could be tested experimentally by creating habits (via learning) of different words and comparing their EMG traces or by assessing their “suppressibility” by articulatory suppression (see for instance Saeki, Baddeley, Hitch, & Saito, 2013). In other words, the motor imagery (or inner speech) of novel versus known material would be underpinned by different processes that would involve the motor system to a different extent. This distinction echoes Pickering & Garrod’s (2013) distinction between the prediction-by-association and prediction-by-simulation mechanisms in language perception and comprehension. 
They suggested that the prediction-by-association mechanism relies more on perceptual sensory experiences and domain-general cognitive abilities (such as memory) whereas the prediction-by-simulation mechanism would rely more on the simulation of the motor action leading to the speech auditory percept. These two mechanisms may be used conjointly and weighed differently according to the task that is performed. The question remains as to how this weighting is performed. We may speculate that for each task to be performed, an astute test is first performed (for instance based on familiarity), in order to determine whether the action to be performed is novel or not, and whether its consequences should be retrieved from memory (or inferred via associative mechanisms) or whether they should be simulated/emulated. In the former case, no peripheral muscular activity is expected, whereas in the latter case, the speech motor system would be involved in simulating/emulating the corresponding overt action. For simulated/emulated actions, the motor consequences of partially inhibited motor commands (i.e., small residual movements) could be recorded peripherally using surface electromyography (cf. also the motor simulation vs. direct simulation (memory retrieval) distinction in Tian & Poeppel, 2012).

Moreover, as discussed in section 1.2.2, there is currently a debate as to the best architecture to model the control of motor actions, and this debate could be extended to inner speech production. More precisely, Pickering & Clark (2014) made a distinction between two types of architectures, differing by the place forward models play in these architectures. First, in auxiliary forward models (AFM), forward models are considered as “special-purpose prediction mechanisms implemented by additional circuitry distinct from core mechanisms of perception and action”. Second, in integrated forward models (IFM), forward models “lie at the heart of all forms of perception and action” (Pickering & Clark, 2014). In other words, forward models are thought to be additional internal models specifically developed for the purpose of emulating motor actions (AFM) or the emulation and prediction function is thought to be realised by the same mechanisms that handle the production of motor actions (IFM). Relatedly, Friston (2011) argued for an IFM architecture and showed how motor control can be formalised in a Bayesian predictive framework, where optimal control can be seen as (active) inference. In these models, there would be no need for an inverse model, because the inverse model can be replaced by a Bayesian inversion of the forward model. According to Friston (2011), “Active inference eschews the hard inverse problem by replacing optimal control signals that specify muscle movements (in an intrinsic frame) with prior beliefs about limb trajectories (in an extrinsic frame)” (p.491). In this kind of model, motor commands are replaced by top-down (proprioceptive) predictions that drive the adjustment of the motor plant (i.e., that produce movements). This idea is similar to perceptual inference in sensory cortices, where descending connections convey predictions whereas ascending connections convey prediction errors (Adams, Shipp, & Friston, 2013). 
These descending signals are themselves predictions of proprioceptive consequences and may therefore play the role of a corollary discharge (without resorting to an inverse model). This is an interesting proposal, as most of the evidence supporting the role of an efference copy during inner speech production (e.g., Ford & Mathalon, 2004; Tian et al., 2018, 2016; Tian & Poeppel, 2010, 2012; Whitford et al., 2017) is actually evidence for the presence of a corollary discharge (leading to sensory attenuation) more than evidence for an efference copy per se. The idea that an inverse model may not be necessary for modelling and explaining inner speech production is also found in Wilkinson & Fernyhough (2017), who suggested that inner speech production could be modelled in a predictive processing framework (for an introduction, see Clark, 2013). The need for an inverse model (or not) might be assessed in several ways. For instance, Pickering & Clark (2014) suggested looking for double dissociations. Indeed, if there are distinct forward and inverse models, lesions to the forward model (studied in patients or via temporary lesion techniques) should disrupt the ability to correct movements online or to learn new movements, but should not prevent movements from being executed. In contrast, in an IFM account, all these abilities should be disrupted by a lesion to the forward (generative) model. Such an empirical test would be crucial in deciding between AFM and IFM architectures and might lead to a revision of current models of motor control.
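
As a schematic illustration only, the idea that an inverse model can be replaced by a Bayesian inversion of the forward model may be written as follows. The notation is ours and is not taken from Friston (2011): \(a\) stands for a candidate action (or motor command), \(s^{*}\) for the desired sensory consequences, \(p(s \mid a)\) for the forward (generative) model and \(p(a)\) for prior beliefs about actions.

\[
p(a \mid s^{*}) = \frac{p(s^{*} \mid a)\, p(a)}{\int p(s^{*} \mid a')\, p(a')\, \mathrm{d}a'}
\]

Under this simplified reading, the mapping from desired consequences to actions (the role classically assigned to the inverse model) is obtained by inverting the forward model with Bayes' rule rather than by a dedicated mechanism. The full active inference formulation is considerably richer than this one-line sketch.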

To sum up this section, our results suggest that some forms of inner speech (e.g., rumination) may neither necessitate nor be specifically associated with activity of the speech motor system. More precisely, because rumination can be considered as a mental habit and be evoked automatically (i.e., it can start without deliberation) by contextual emotional cues (for instance), it may not recruit the speech motor system to the same extent as deliberate inner speech does. We suggested two (non-exclusive) interpretations of these results. First, the amount of inhibition applied to motor commands emitted during inner speech production (and during other forms of motor imagery, more generally) may be modulated by characteristics of the task (e.g., perspective, type of motor imagery, novel or familiar content) and by individual characteristics (e.g., expertise). Second, automatic or “habitual” inner speech would differ from deliberate expanded inner speech in that different processes would underlie their production. The former would rely more on associative memory-based processes (with no or lesser involvement of the speech motor system) whereas the latter would rely more on simulation/emulation mechanisms (with a greater involvement of the speech motor system).

8.2.4 Implication of these results for rumination theories

One of the most noticeable properties of ruminative thoughts is their repetitiveness. Everyone knows the feeling of being trapped in a chain of endless recurring thoughts. Moreover, the initiation of rumination is often automatic (cf. our previous discussion of the meaning of automatic in this context). In Chapter 1, we briefly presented the habit-goal framework of depressive rumination introduced in Watkins & Nolen-Hoeksema (2014). This theoretical framework provides an elegant integration of accounts explaining how rumination starts and how it is maintained. It is built on the idea that rumination could be considered as a mental habit (Hertel, 2004). More formally, in conditioning theories, a habit is formed when a response is repetitively associated with a stimulus (and when this association is reinforced). In other words, a stimulus-response (S-R) habit is learnt when some behaviour is contingent on some stimulus. Importantly, habits are automatic behaviours: they lack awareness, they are mentally efficient and are often difficult to control. Moreover, as habits are usually slow to learn, they are also slow to unlearn (i.e., they are relatively stable over time). The habit-goal framework considers rumination as a form of habitual response to goal-state discrepancies that occur frequently and repetitively in the same emotional context (i.e., depressed mood). Put simply, rumination can be considered as a habitual response (behaviour) to an emotional context (depressed mood). Therefore, this framework makes it possible to explain how rumination, while being originally triggered by goal-state discrepancies, might become independent of these goals through repetition of this association. After learning, rumination might simply be “evoked” by contextual cues (e.g., negative mood). This would partially explain why rumination, as a habitual response, is particularly difficult to interrupt. 
As discussed in the previous section, considering rumination explicitly as a form of habit also helps explain why rumination can be considered as a form of inner speech that does not involve the speech motor system. Therefore, our results do not contradict the habit-goal framework of depressive rumination.

According to consensual models of inner speech development and production, different forms of inner speech sit on a continuum from condensed inner speech to expanded inner speech (and to overt speech). This continuum is often discretised and discussed in terms of several levels such as “condensed inner speech” and “expanded inner speech” (e.g., Fernyhough, 2004), although condensation might be more precisely described as a continuous dimension (e.g., Grandchamp et al., 2019). As discussed previously, it is usually assumed that more internalised[61] forms of inner speech are also more condensed. These forms of inner speech are structurally different from overt speech or expanded inner speech as they contain more abbreviation, are more predicative, and so on (cf. section 1.2.1.2). Therefore, based on our conclusion that verbal rumination is neither necessarily nor specifically associated with activity in the speech muscles, we expect rumination to be similar to other forms of condensed inner speech and to show similar structural properties. How could this hypothesis be examined? It is tempting to look into think-aloud protocols, where participants are asked to think aloud during some task. For instance, Lyubomirsky, Tucker, Caldwell, & Berg (1999) examined the phenomenology of ruminative thoughts by asking participants to “ruminate outloud”. They observed that rumination is generally associated with an overly negative tone, increased self-criticism and self-blame, and reduced confidence, optimism, and perceived control. Although this kind of protocol may be suitable to examine the emotional content of “outloud rumination”, it is not appropriate to examine the syntactical properties of rumination, as outloud rumination is expected to differ significantly from silent rumination. 
Indeed, as highlighted by Fernyhough (2004), classical models of inner speech development and production postulate that the specified levels do not only represent stages of development but also determine possible movements between levels during production. More precisely, externalising inner speech leads to a “re-structuration” of inner speech (or a “reconstruction”, cf. Sokolov, 1972), continuously replacing the properties of condensed inner speech by the properties of overt (private) speech. In other words, it is not possible to observe outloud rumination to examine the properties of silent rumination, as externalising rumination is expected to change the properties of rumination. Another possibility is to rely more on the self-reports of participants trained to identify the syntactical properties of their (ruminative) thoughts (e.g., Hurlburt et al., 2013; Smadja, 2019). Complementary information may also be gathered by combining several sources of neuroimaging and psychophysiological data to identify biological markers of the different forms of inner speech (e.g., Grandchamp et al., 2019).

So far, we have considered rumination as a habit and discussed how this view could account for our results. However, our results are about induced rumination, that is, rumination induced by some experimental manipulation (in contrast to naturally occurring rumination, which is evoked automatically). Does that not contradict the interpretation of our results as being consistent with the habit-goal framework of depressive rumination? It is possible that the properties of rumination we inferred from our results do not generalise well to naturally occurring rumination. We think there are good reasons to think otherwise, and we suggest that our above discussion also applies to naturally occurring rumination (and not only to induced rumination). For instance, if rumination can be described as a habit, we know that it is generally possible to induce a habit, if only by presenting the appropriate contextual cues (i.e., the cues that usually trigger the habit). Therefore, a rumination induction can be seen as an artificial[62] cue created in the lab specifically to trigger rumination. Once rumination has been triggered, we have no reason to think that induced rumination and naturally occurring rumination differ substantially with respect to the properties of habits that we are interested in here (e.g., automaticity).

Having clarified this point, one important question remains. What does viewing rumination as a mental habit entail for psychotherapies targeting rumination? This view suggests that focusing on changing beliefs, attitudes or intentions will not be effective in changing habitual behaviours such as rumination (Watkins & Nolen-Hoeksema, 2014). Instead, Wood & Neal (2007) have proposed to provide patients with “concrete tools for controlling habit cueing” (p.860), that is, tools that may be used to alter or avoid exposure to the cues that trigger rumination. For instance, if rumination is associated with cues that occur in a certain location, changing location is hypothesised to interrupt rumination. However, removing the context in which rumination appears does not change the context-response association leading to rumination. Therefore, as soon as this context reappears (e.g., depressive mood), rumination may reappear. More robust interventions should then target the context-response association itself. Hertel (2004) suggested that the best way to overcome mental habits is not to merely oppose them through controlled procedures (e.g., cognitive control) but to train new habits through controlled practice: “In short, the best antidote to maladaptive habits is a new set of habits” (p. 209). To put it simply, the unhelpful contextual response (e.g., rumination) needs to be replaced with a more helpful response, in order to create a (more adaptive) new habit. Such an intervention would require i) identifying the triggering cue and ii) replacing the unhelpful response (rumination) by a more adaptive and incompatible one (e.g., concrete thinking, relaxation). 
As noted by Watkins & Nolen-Hoeksema (2014), these directions imply that classical cognitive behavioural therapies or cognitive bias modification approaches to change negative biases or challenge thoughts will not be effective unless these strategies are implemented as an alternative to rumination (as a new habit). Overall, considering rumination as a mental habit is conceptually fruitful as it opens new possibilities for the understanding and care of rumination. Future research could relate the vast literature on the computational modelling of habit development and maintenance in the brain (e.g., Daw, Niv, & Dayan, 2005; Dolan & Dayan, 2013; FitzGerald, Dolan, & Friston, 2014) to the development and maintenance of rumination.

8.3 Methodological limitations and ways forward

As always, several limitations are worth keeping in mind when reading the present discussion. Most obviously, we restricted ourselves to recruiting samples of undergraduate students in Psychology at Univ. Grenoble Alpes (France) and Ghent University (Belgium). Moreover, as discussed in Chapter 3, we almost exclusively recruited female participants, as they are known to be more prone to rumination than male participants (Johnson & Whisman, 2013). Although these choices facilitated the recruitment of participants, they obviously constrained the generalisability of our findings. Moreover, we only recruited populations of WEIRD participants, that is, western, educated, industrialised, rich and democratic participants (Henrich, Heine, & Norenzayan, 2010), which is a threat to external validity. These issues could be avoided (or reduced) in future studies by relying more on modern large-scale collaboration networks such as StudySwap (Chartier et al., 2016) or the Psychological Science Accelerator (Moshontz et al., 2018).

Continuing with validity concerns, Flake & Fried (2019) coined the term questionable measurement practices (echoing the questionable research practices introduced by Simmons, Nelson, & Simonsohn, 2011) to designate the degrees of freedom a researcher has in choosing how to measure a psychological construct of interest (a freedom that may lead to v-hacking, the validity analogue of p-hacking; Hussey & Hughes, 2018). They identify several problematic measurement practices (such as creating on-the-fly scales) and propose a set of questions to recognise and to avoid these practices (e.g., what is your construct? why do you select your measure?). Importantly, they suggest that on-the-fly scales should only be used if these scales underwent validity checks and the researchers report these checks (or the absence thereof). These issues are important for the present work as we used several on-the-fly scales that, admittedly, did not undergo proper psychometric validation checks. For instance, the scales used to assess state rumination in Chapters 3 and 6 were created by our team and did not go through classical validation procedures (the French version of the BSRI used in Chapter 4 was not validated at the time of the study but is currently undergoing a validation procedure). These poor methodological choices may be explained by the lack of satisfactory measures of state rumination at the beginning of the present investigation. This gap has been filled recently with the development of the BSRI (Marchetti et al., 2018), a scale that we used to assess state rumination in our most recent study (discussed in Chapter 7). Although the results of this experiment are preliminary (as data collection is still ongoing), examination of the results from this experiment and of the discrepancies between these results and the results from other experimental chapters may be informative with regard to the validity of our on-the-fly scales.

On a different note, we assumed throughout our work (as is commonly done in Psychology) that the magnitude of an effect in the sample was the sign (and the best estimate) of the magnitude of a population effect, which may (or may not) correspond to the effect observed on an average individual (which may or may not reflect any particular individual). In other words, we did not consider the intrinsic heterogeneity of the effect.[63] However, we know that the variability of an effect across participants in psychological experiments is often substantial. Although this variability may itself vary across sub-fields (which may affect the participants-trials trade-off, cf. Rouder & Haaf, 2018), the effect is nonetheless expected to always vary to a non-negligible extent. Besides asking whether an effect exists or not or what the magnitude of the effect is, another potentially interesting question is to ask whether all participants in a study show the effect. Haaf & Rouder (2017) and Rouder & Haaf (2018) named this property the dominance of an effect, with dominant effects being effects that we can observe in every participant and non-dominant effects being effects that do not show the same sign in every participant (i.e., some participants will show an effect in some direction whereas some other participants will not show the effect or will show the effect in the other direction). Haaf & Rouder (2017) proposed a method to assess the dominance claim and showed that, for instance, the Stroop effect can generally be considered as a dominant effect. Importantly, a dominant effect is not necessarily an effect that manifests itself in every participant in some given sample. The dominance claim is a claim about the population effect and some participants in a given experiment may show a null effect or an effect in the opposite direction simply because of statistical fluctuations. 
By comparing different models implementing different sets of constraints, it is however possible to quantify the relative predictive accuracy of these models and to conclude on the dominance of an effect. To sum up, a small positive effect may be the sign of a dominant small positive effect (i.e., it is positive for everyone) or it may be the sign of a non-dominant effect (i.e., it may as well be negative and large for some participants). In our context, the association between inner speech and peripheral muscular activity may well be a non-dominant effect, with the peripheral muscular activity only being present in some participants, and not being present in some other participants. Further investigations of the heterogeneity of this effect and the comparison of models explicitly incorporating different sets of constraints may help resolve this issue.
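
The point that the same average effect is compatible with both a dominant and a non-dominant pattern can be illustrated with a minimal simulation. This sketch is not the model-comparison method of Haaf & Rouder (2017); it only generates hypothetical "true" individual effects under two assumed populations with the same mean. The function names, the mean of 0.1, the standard deviation of 0.3 and the choice of distributions are all illustrative assumptions.

```python
import random
import statistics


def simulate_true_effects(n, dominant, mean=0.1, seed=1):
    """Draw n hypothetical true individual effects with the given population mean.

    dominant=True : effects follow an exponential distribution, so no
                    individual has a negative true effect.
    dominant=False: effects follow a normal distribution with a large
                    spread (SD = 0.3, an arbitrary choice), so a sizeable
                    share of individuals has a negative true effect.
    """
    rng = random.Random(seed)
    if dominant:
        return [rng.expovariate(1 / mean) for _ in range(n)]
    return [rng.gauss(mean, 0.3) for _ in range(n)]


def share_negative(effects):
    """Proportion of individuals whose true effect is below zero."""
    return sum(e < 0 for e in effects) / len(effects)


dominant_effects = simulate_true_effects(5000, dominant=True)
mixed_effects = simulate_true_effects(5000, dominant=False)

for label, effects in [("dominant", dominant_effects), ("non-dominant", mixed_effects)]:
    print(label, round(statistics.mean(effects), 3), round(share_negative(effects), 3))
```

Both simulated populations have an average effect close to 0.1, yet only the second contains individuals whose true effect goes in the opposite direction. Distinguishing the two situations from noisy data requires the kind of constrained model comparison described above, which this sketch does not implement.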

8.4 Conclusion

In this work we aimed at examining the involvement of the speech motor system during induced rumination in healthy participants. Given the predominantly verbal character of rumination, we sought to examine it using the tools and methods used to investigate inner speech. More precisely, we used surface electromyography and articulatory suppression to probe the role of the motor system during rumination. This investigation led us to the conclusion that the activity of the speech motor system was not necessary for experiencing rumination. Moreover, verbal rumination was not specifically associated with peripheral muscular activity in the speech muscles (albeit facial surface electromyography may be used to assess the presence of rumination). Although these results seem to contradict previous results on the role of the speech motor system during inner speech production, they fit well with a mental-habit account of rumination, in which rumination is considered as an automatic or habitual form of inner speech. We suggest that during the creation of this habit (i.e., through repetition), rumination becomes a strongly condensed form of inner speech that does not critically involve the speech motor system. Moreover, these results also make sense in consideration of the multiple varieties of the inner speech phenomenon that can vary along dimensions of condensation and deliberateness. These results highlight the role of factors such as automaticity and deliberateness in the complex relation between inner speech and the speech motor system. Overall, these results pave the way for new ways of investigating rumination as a form of inner speech, where phenomenological and psychophysiological characteristics of rumination can be related to the long tradition of inner speech research.


  1. More precisely, concluding \(p\) on the basis of \(q\) would be committing the “affirming the consequent” fallacy, known formally as \(\frac{p \rightarrow q, q}{\therefore p}\). In other words, observing \(q\) is insufficient to conclude \(p\) because \(q\) might have been observed for other reasons than \(p\). In our situation, the EMG amplitude might have increased for other reasons than the rumination induction. For instance, it might be that the EMG amplitude was higher after induction only because we compared muscular activity after induction to a relaxation period (which might show a lower than usual level of muscular activity). Alternatively, the increase in EMG amplitude might be due to the fact that participants were doing something, in opposition to doing nothing (i.e., this increase may not be specific to rumination).

  2. We recognise that this formulation may still be incomplete as some additional auxiliary or instrumental hypotheses may still be incorporated in order to draw a more exhaustive picture of the argument.

  3. These results may also be interpreted by saying that the non-specific relaxation (i.e., relaxing the whole body) was also interfering with speech motor processes and with rumination.

  4. By “externalisation” we mean here the degree to which inner speech recruits the speech motor system, with fully externalised speech corresponding to ordinary overt speech production.

  5. We use “internalised” here to mean the opposite of “externalised” in the way we used it a few pages before, that is, a form of inner speech that does not recruit the speech motor system.

  6. By “artificial”, we mean here that the cues triggering rumination have been specifically developed in laboratory settings to induce rumination.

  7. We considered it statistically though, by using multilevel models, when appropriate. However, we did not consider this heterogeneity further in our discussion of the results and in our conclusions.