IAPT and The Rogue Driving Instructor

Imagine that the would-be driving instructor for your son or daughter has it on public record that, for every ten people attending two or more lessons, roughly seven attend just once. Warning lights would flash. But the latest IAPT data, for September last, https://digital.nhs.uk/data-and-information/publications/statistical/psychological-therapies-report-on-the-use-of-iapt-services/september-2021-final-including-reports-on-the-iapt-pilots-and-quarter-2-data-2021-22 show just such poor engagement, with 39,734 having only one treatment appointment and 56,972 having two or more treatment sessions. Further, 43,258 people failed to follow up their referral (self or GP), a number comparable to those having one treatment session or two or more. This suggests that IAPT is not high in the public credibility stakes. The driving instructor may claim a 50% pass (recovery) rate, but would you believe them without independent verification? IAPT’s self-proclamation of such a recovery rate lacks credibility.
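The engagement figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are assumptions of mine, and the three counts are simply those quoted in the text from the September 2021 IAPT report.

```python
# Counts as quoted in the text from the September 2021 IAPT report.
one_appointment = 39_734   # had only one treatment appointment
two_or_more = 56_972       # had two or more treatment sessions
no_follow_up = 43_258      # did not follow up their referral (self or GP)

# People attending once for every person attending two or more times.
ratio_once_to_repeat = one_appointment / two_or_more

# Of everyone who had at least one treatment appointment,
# the share who attended only once.
started = one_appointment + two_or_more
share_once = one_appointment / started

print(f"Attended once per repeat attender: {ratio_once_to_repeat:.2f}")
print(f"Share of starters attending once:  {share_once:.1%}")
print(f"Non-starters per starter:          {no_follow_up / started:.2f}")
```

On these counts, about seven people attend just once for every ten who attend two or more times.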

IAPT claims 7.9 sessions of treatment per referral, but can this be regarded as sufficiently potent when NICE recommended treatments are typically twice this length? On December 9th 2021 NHS Digital proclaimed that ‘Improving Access to Psychological Therapies (IAPT) is run by the NHS in England and offers NICE-approved therapies for treating people with depression or anxiety’. Yet neither NICE nor IAPT has provided any evidence of treatment fidelity. Both display what the Chair of the Hillsborough Independent Panel has termed ‘the patronising disposition of unaccountable power’ [‘Justice for Christ’s Sake’ by James Jones, SPCK (2021)]. The Panel also highlighted 3 necessities for further public enquiries: empathy, equality and candour. It would be empathetic to ask IAPT clients ‘are you back to your old self with the treatment you have received or alternatively are you back to your best?’. Equality would mean giving precedence to the client’s definition of their situation, and not to an organisational device (PHQ-9 and GAD-7) administered in such a way as to protect the reputation of the Service. Candour would be allowing IAPT therapists to tell it as it is, no longer too fearful to speak out or having to use such measured tones that the central meaning of what they have to say is lost.

Dr Mike Scott

 

People Cannot Benefit from a Treatment To Which They Have Not Been Exposed – The Undermining of IAPT

The Improving Access to Psychological Therapies (IAPT) Service does not assess treatment fidelity. Thus, there can be no certainty that clients receive an evidence-based treatment (EBT). IAPT therapies are not EBTs. Despite this, SilverCloud, the major funder of IAPT training days, claims on its website ‘up to a 70% real-world recovery’ using its computer assisted products, for all common disorders except PTSD and OCD! The Advertising Standards Authority (ASA) needs to look at this; the ASA has a complaints form that can be completed online. SilverCloud’s UK address is Suite 1350, Kemp House, 152 City Road, London, EC1V 2NX. My own study of 90 IAPT cases suggests just a 10% recovery rate [Scott (2018) https://doi.org/10.1177%2F1359105318755264].

IAPT has produced no evidence that its therapists using SilverCloud make any added difference to their clients over and above that of those who do not use it. See the NICE Medtech innovation briefing on SilverCloud’s Space from Depression programme, ‘Space from Depression for treating adults with depression’, published May 7th 2020. Strangely the NICE IAPT Expert Panel concluded that the case for adoption is ‘partially supported’, despite the body of the report noting lower depression scores at the end of treatment for the clients of therapists who did not use the computer assisted CBT. An example of spin and conflict of interest.

 

The SilverCloud website cites 10 references appearing in peer-reviewed journals to support its work. But none of the studies cited by SilverCloud involve blind independent assessors of outcome using a ‘gold-standard’ diagnostic interview. In the cited review study by Wright et al (2019) [Wright JH, Owen JJ, Richards D, et al. Computer-assisted cognitive-behavior therapy for depression: a systematic review and meta-analysis. J Clin Psychiatry. 2019;80(2):18r12188] the third author is employed by SilverCloud.

‘Real-world’ recovery represents a change that a client would care about, such as no longer suffering from the disorder that they were suffering from before treatment or a return to best functioning. In a footnote SilverCloud defines recovery as ‘Moving from clinical caseness to non-caseness, i.e. lowering the score on PHQ-9 and GAD-7 from above the clinical threshold to below the threshold’. Such changes are meaningless to clients; they are not ‘real-world’.

Here is what one client told me:

‘I found Silvercloud ineffective, generic and not tailored to my personal situation. It wasn’t engaging or helpful and as such I didn’t engage with the website very much. Consequently, the following weekly calls with the IAPT therapist were sometimes made difficult by the fact I hadn’t completed the same questionnaire as the week before or read through articles. I wanted to talk about my situation, my feelings and find out why I was feeling the way I was, but I felt I was just being led back to using the online SilverCloud resource.

‘It was in 2017 that my doctor suggested I try SilverCloud online CBT with telephone support and in September 2017, I started speaking to another IAPT counsellor. He seemed to be a very nice man. After a few weekly calls, he stated that he didn’t believe I was depressed and so he changed the original Silvercloud course I had started and reset it back to a new series of 6 sessions. The weekly calls lasted between 20 minutes to an hour depending on what we discussed, but always concluded with him asking me to log onto SilverCloud and work my way through the programme before our next call. After the requisite 6 sessions finished in February 2018, that was it! No answers, no tools to help me cope, just signed off, discharged, but told I had 12 month access to SilverCloud. I haven’t used the resource since’.

In general the claims of clinicians and supervisors with regard to treatment fidelity do not match those of independent blind-raters [Waltman et al (2017) https://doi.org/10.1016/j.janxdis.2021.102407]; there are vested interests at play.

The author knows of no study of low intensity CBT (guided self-help, group psychoeducation, computer assisted CBT) that has assessed treatment fidelity. Usage of a manual does not guarantee treatment fidelity. Approx. three quarters of IAPT clients receive low intensity intervention on entry to the Service [Davis et al (2020) https://doi.org/10.1136/ebmental-2019-300133].

IAPT’s approach ostensibly depends on the results of randomised controlled trials (RCTs) of CBT, but a study of remission rates in CBT for anxiety disorders (including OCD and PTSD) [Levy, Bryan and Tolin (2021) https://doi.org/10.1016/j.janxdis.2021.102407] showed that in almost half the studies (8 out of 17) there was a high risk of bias because of a failure to address treatment fidelity. Further, in 7 of the 17 studies there was a high risk of bias because of the failure to use blind assessors. A review of psychotherapy trial reports published in 6 top psychiatry journals in 2017 and 2018 revealed that only 59% of the included trials reported adequate blinding of outcome assessors [Mataix-Cols et al (2021) https://jamanetwork.com/journals/jama/fullarticle/10.1001/jamapsychiatry.2021.1419?utm_campaign=articlePDF%26utm_medium=articlePDFlink%26utm_source=articlePDF%26utm_content=jamapsychiatry.2021.1419]. Thus, the research base that IAPT draws upon is far from rock solid. The remission rate in RCTs for anxiety disorders is approx. 50% [Springer et al (2018) https://doi.org/10.1016/j.cpr.2018.03.002] and this is the ‘gold standard’. But IAPT claims comparable results despite a total disregard for blinding and treatment fidelity! The faked goods ought perhaps to be reported to Trading Standards as well as the ASA, in lieu of any interest in the matter from the British Psychological Society (BPS) or the British Association for Behavioural and Cognitive Psychotherapies (BABCP)!

The real story of SilverCloud is that it provides morsels of CBT when what is really needed is a proper meal. It is insulting to clients to in effect say ‘let’s see how you get on with morsels and then we will see about a proper meal’.

 

Dr Mike Scott

New NICE Menu for Depression

The proposed Guidance, published last month, excludes consideration of assessment. Recommendations are therefore built on sand. Depression can occur in a variety of contexts and alongside other disorders; NICE’s response is that it doesn’t matter so long as there is a high score on a depression psychometric test. The clinician, not the client, holds the menu; the former takes the latter through the options in a set order. For ‘less severe’ depression group CBT is to be canvassed first with clients; next in line is group behavioural activation, despite the fact that the latter group modality has not been assessed with blind independent assessors.

 

NICE advocates different pathways for ‘less’ and ‘more severe’ depression, advocating a cut-off of 16 on the PHQ-9. De facto the authors rubber-stamp the widely held practice, reflected in the Improving Access to Psychological Therapies (IAPT) Service, of routing high scorers on a depression psychometric test (e.g. PHQ-9 score 10 or greater) to treatment for this condition. But patients with a wide range of disorders, including panic disorder, PTSD, obsessive compulsive disorder and adjustment disorder, have elevated depression scores. Nevertheless, NICE signals a diversion along a depression pathway with one fork for ‘less severe’ and another for the ‘more severe’. Clinicians and clients are likely to be equally bemused by the ‘road signs’. The upshot is likely to be misguided treatment. NICE have invited the public to comment on their intended guidance on the treatment of depression https://www.nice.org.uk/guidance/indevelopment/gid-cgwave0725/consultation/html-content-3. Commentary has to be submitted specifying the particular paragraph that any comment is about, so it is somewhat tedious, and you may well decide to write your Christmas cards instead.

 

Generalising from Low Quality Studies

In assessing the outcome studies NICE do not take seriously the concept of minimally important difference (MID), i.e. what change a patient would see as the minimum necessary for them to say treatment has made a real-world difference. There is no evidence that patients would regard a change of score on a psychometric test as conferring a real-world difference. But they would recognise being back to their old self or best functioning and possibly no longer suffering from the disorder, so that loss of diagnostic status would be a reasonable proxy for a MID. However, only a minority of studies furnish this data with the use of blind assessors. Inferences can therefore only be properly drawn from this sub-population of studies, which excludes the low intensity studies. As an exemplar see the comparison of group CBT and group behavioural activation at the end of this document.

 

Pseudo-preferences

 

Under the proposed Guidance clients’ preferences are paramount. If the client is judged as having ‘less severe’ depression and volunteers no treatment preference, they are to be taken through a menu of options in a set order, starting with group cognitive behavioural therapy first, group behavioural activation second, individual CBT third and on to the 11th option, short-term psychodynamic therapy. For ‘more severe’ depression top of the league is individual CBT plus antidepressants, in 2nd place individual CBT, in 3rd place individual behavioural activation and in last and 10th place group exercise. The ‘more severe’ route is more labour intensive and there is likely to be congestion, as approximately half those entering IAPT have mean scores of 15 or more on the PHQ-9 [Saunders et al (2020) https://doi.org/10.1017/S1754470X20000173]. Unwittingly the Guidance spells the end of low intensity interventions, because none of the top of the league options are low intensity! But 70% of clients entering the IAPT service are given a low intensity intervention first. However, there is nothing to prevent a Service Provider declaring that ‘unfortunately none of the top of the league options are currently available’, so that recourse has to be made to options in danger of relegation.

Psychometric Test Results Can only be Considered in Context

 

The NICE guidance assumes that psychometric test results speak for themselves, but they are only meaningful when described in context. To my knowledge there is no study of the reliability of the PHQ-9 in UK routine mental health services compared to a ‘gold standard’ diagnostic interview. Rather, data on the PHQ-9 has been extrapolated from US studies of psychiatric outpatients, in a population with a high prevalence of depression, but not using a ‘gold standard’ diagnostic interview [the PRIME-MD was used instead, with insufficient distinction between this interview and the questions on the PHQ-9]. It is the author’s experience that in the UK the PHQ-9 gives a large number of false positives compared to a reliable diagnostic interview, such as the SCID.

 

The Need to Contextualise Outcome Studies

NICE has a ‘blind spot’ about context. In its analysis of outcome studies it lumps together ‘depression studies’ that were wholly reliant on self-report measures with those that included the results of a diagnostic interview as an outcome measure. Outcome is assessed in terms of statistical differences between either different modes of service delivery, e.g. stepped v non-stepped, or between different treatments, e.g. CBT v waiting list. There was no attempt to discern what proportion of clients in each arm of a study would have regarded themselves as back to their normal selves or best functioning post treatment [or, in lieu of this, lost their diagnostic status] and the duration of those gains. Rather than being asked to cite preferences over treatments they largely have no knowledge of, patients would be very interested in the likelihood of treatment making a real-world difference to their lives, i.e. a difference that they would care about.

The Need to Consider Effectiveness Studies Not Just Efficacy Studies

NICE’s failure to look at context is highlighted in the top league place it gives to group CBT for less severe depression. There is no mention that in our study [Scott and Stradling (1990) https://doi.org/10.1017/S014134730001795X] of individual and group CBT for depression in Toxteth, Liverpool the invitation to group CBT went down like a ‘lead balloon’ and we had to change the protocol to include up to 3 individual sessions in the ‘group’ arm. Entry was determined by independent diagnostic interview, but mean entry Beck Depression Inventory scores were around 28, so the population was likely ‘more severe’ in NICE terms. NICE also fails to critically appraise the Group Behavioural Activation studies, having previously called for BA studies to include observer rated assessments. They might also have added the need for credible attention control comparisons. NICE is content with statistical sweeps of large data sets rather than trying to discern what is happening at the coal face.

Ignoring the Pandemic

NICE puts group interventions at the top of the league for less severe depression, but ignores the context of the pandemic. Realistically, how possible will it be to get 2 therapists together with 8 clients for 90 minutes a week for 8 weeks, all face to face, with masks? The logistics and effectiveness of conducting it online are a venture into the unknown. NICE appears to operate without contextualisation of findings.

 

Failing to Pay Attention to the Detail of Group Interventions

In 2019 Kellett et al published a paper in Behavior Therapy, 50 (2019) 864–885; the abstract advocates Group Behavioral Activation for depression as a front line treatment. The abstract also claims a moderate to large effect on depressive symptoms. NICE appears not to have read further than the abstract, but closer inspection reveals the conclusions are deeply flawed.

In passing the abstract mentions that the standardized mean difference (SMD) between group BA and waiting list was 0.72. This would cause few people to question the findings, but it means the results are of doubtful clinical relevance, as there is less than one standard deviation in outcome between the treated group and the waiting list. Suppose a group of depressed patients had a mean Beck Depression Inventory score of 28 at the start of treatment, and that the spread of the results (the standard deviation) was 7, a figure taken from the Scott and Stradling (1990) study [Behavioural Psychotherapy, 18, 1-19]. A mean score of 23 at the end of treatment would then produce an SMD of 0.71, i.e. about the same as in the University of Sheffield analysis. Thus the average person experiencing this change of score is unlikely to feel that they are back to their normal self, and is likely to view it as part of the normal cycling of mood, influenced by positive events, e.g. the company/support of fellow sufferers for a time in a group. In none of the Group BA studies was there an independent assessor determining whether clients were still depressed or the permanence of any change. Unsurprisingly the authors found that Group BA was no better than any other active treatment (i.e. controlling for attention and expectation), and make an implicit plea for the Dodo verdict ‘all therapies are equal and must have prizes’.
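The worked example above can be reproduced directly. This is a minimal sketch, assuming the simplest form of the SMD (a difference in means divided by a single standard deviation); the function name is hypothetical, and the values 28, 23 and 7 are the illustrative figures from the text, not data from the Group BA analysis itself.

```python
def standardized_mean_difference(mean_a: float, mean_b: float, sd: float) -> float:
    """Difference between two means, expressed in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_a - mean_b) / sd


# Illustrative figures from the text.
start_bdi = 28.0    # mean Beck Depression Inventory score before treatment
end_bdi = 23.0      # mean score at the end of treatment
assumed_sd = 7.0    # spread taken from Scott and Stradling (1990)

smd = standardized_mean_difference(start_bdi, end_bdi, assumed_sd)
print(f"SMD for a 5 point drop: {smd:.2f}")

# Working backwards: the raw BDI change implied by the reported
# SMD of 0.72, under the same assumed standard deviation.
implied_change = 0.72 * assumed_sd
print(f"BDI points implied by SMD 0.72: {implied_change:.2f}")
```

Under these assumptions an SMD of 0.72 corresponds to a shift of about 5 points on the BDI, which still leaves a mean score in the low twenties.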

In the body of the BA paper the authors acknowledge that the Group BA studies are of low quality, save one, and that analyses were on treatment completers as opposed to the more rigorous intention to treat. But there is no indication anywhere as to what proportion of people recover from depression with any permanence.

In 1990 Steve Stradling and I published [Behavioural Psychotherapy, 18, 1-19] a study of depressed clients comparing group CBT, individual CBT and a waiting list condition. For group CBT the initial mean BDI was 29.0 and the end of treatment score was 6.2, whilst for individual treatment the comparable scores were 28.21 and 11.53. However, those on the waiting list also improved, from 25.89 initially to 20.26 at the end of the waiting list period. Thus, it is far from clear that the results from the University of Sheffield analysis on Group BA are actually better than those of putting people on a waiting list.
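The size of the change in each arm can be laid out side by side. The sketch below is illustrative only: it uses the mean BDI scores quoted above, and dividing by a standard deviation of 7 is an assumption carried over from the earlier worked example rather than a pooled figure from the study. A within-arm change of this kind is also not the same quantity as a between-group SMD, so the comparison is indicative at best.

```python
# Mean BDI scores (start, end) as quoted in the text
# from Scott and Stradling (1990).
arms = {
    "group CBT": (29.0, 6.2),
    "individual CBT": (28.21, 11.53),
    "waiting list": (25.89, 20.26),
}

ASSUMED_SD = 7.0  # assumption: the illustrative spread used earlier in the text

# Raw improvement in BDI points for each arm.
changes = {name: round(start - end, 2) for name, (start, end) in arms.items()}

for name, change in changes.items():
    in_sd_units = change / ASSUMED_SD
    print(f"{name:15s} change = {change:5.2f} points ({in_sd_units:.2f} SD)")
```

On these figures the waiting list arm improved by 5.63 points with no treatment at all, which is slightly more than the 5 point shift used above to illustrate an SMD of about 0.7.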

Dr Mike Scott

 

 

 

 

 

The Proposed NICE (Mis)Guidance on the Treatment of Depression

The proposed Guidance excludes consideration of assessment https://www.nice.org.uk/guidance/indevelopment/gid-cgwave0725/consultation/html-content-3, in its update of the 2009 Guidance [CG90], despite advocating different pathways for ‘less’ and ‘more severe’ depression, with a cut-off of 16 on the PHQ-9! De facto the authors rubber-stamp the widely held practice, reflected in the Improving Access to Psychological Therapies (IAPT) Service, of routing high scorers on a depression psychometric test (e.g. PHQ-9 score 10 or greater) to treatment for this condition. But patients with a wide range of disorders, including panic disorder, PTSD, obsessive compulsive disorder and adjustment disorder, have elevated depression scores. Nevertheless, NICE signals a diversion along a depression pathway with one fork for ‘less severe’ and another for the ‘more severe’. Clinicians and clients are likely to be equally bemused by the ‘road signs’. The upshot is likely to be misguided treatment.

In assessing the outcome studies NICE do not take seriously the concept of minimally important difference (MID), i.e. what change a patient would see as the minimum necessary for them to say treatment has made a real-world difference. There is no evidence that patients would regard a change of score on a psychometric test as conferring a real-world difference. But they would recognise being back to their old self or best functioning and possibly no longer suffering from the disorder, so that loss of diagnostic status would be a reasonable proxy for a MID. However, only a minority of studies furnish this data with the use of blind assessors. Inferences can therefore only be properly drawn from this sub-population of studies, which excludes the low intensity studies.

Under the proposed Guidance clients’ preferences are paramount. If the client is judged as having ‘less severe’ depression and volunteers no treatment preference, they are to be taken through a menu of options in a set order, starting with group cognitive behavioural therapy first, group behavioural activation second, individual CBT third and on to the 11th option, short-term psychodynamic therapy. For ‘more severe’ depression top of the league is individual CBT plus antidepressants, in 2nd place individual CBT, in 3rd place individual behavioural activation and in last and 10th place group exercise. The ‘more severe’ route is more labour intensive and there is likely to be congestion, as approximately half those entering IAPT have mean scores of 15 or more on the PHQ-9 [Saunders et al (2020) https://doi.org/10.1017/S1754470X20000173]. Unwittingly the Guidance spells the end of low intensity interventions, because none of the top of the league options are low intensity! But 70% of clients entering the IAPT service are given a low intensity intervention first. However, there is nothing to prevent a Service Provider declaring that ‘unfortunately none of the top of the league options are currently available’, so that recourse has to be made to options in danger of relegation. So much for NICE compliance and patient choice.

The NICE guidance assumes that psychometric test results speak for themselves, but they are only meaningful when described in context. To my knowledge there is no study of the reliability of the PHQ-9 in UK routine mental health services compared to a ‘gold standard’ diagnostic interview. Rather, data on the PHQ-9 has been extrapolated from US studies of psychiatric outpatients, in a population with a high prevalence of depression, but not using a ‘gold standard’ diagnostic interview [the PRIME-MD was used instead, with insufficient distinction between this interview and the questions on the PHQ-9]. It is the author’s experience that in the UK the PHQ-9 gives a large number of false positives compared to a reliable diagnostic interview, such as the SCID.

NICE has a ‘blind spot’ about context. In its analysis of outcome studies it lumps together ‘depression studies’ that were wholly reliant on self-report measures with those that included the results of a diagnostic interview as an outcome measure. Outcome is assessed in terms of statistical differences between either different modes of service delivery, e.g. stepped v non-stepped, or between different treatments, e.g. CBT v waiting list. There was no attempt to discern what proportion of clients in each arm of a study would have regarded themselves as back to their normal selves or best functioning post treatment [or, in lieu of this, lost their diagnostic status] and the duration of those gains. Rather than being asked to cite preferences over treatments they largely have no knowledge of, patients would be very interested in the likelihood of treatment making a real-world difference to their lives.

NICE’s failure to look at context is highlighted in the top league place it gives to group CBT for less severe depression. There is no mention that in our study [Scott and Stradling (1990) https://doi.org/10.1017/S014134730001795X] of individual and group CBT for depression in Toxteth, Liverpool the invitation to group CBT went down like a ‘lead balloon’ and we had to change the protocol to include up to 3 individual sessions in the ‘group’ arm. Entry was determined by independent diagnostic interview, but mean entry Beck Depression Inventory scores were around 27, so the population was likely ‘more severe’ in NICE terms. NICE also fails to critically appraise the Group Behavioural Activation studies, having previously called for BA studies to include observer rated assessments. They might also have added the need for credible attention control comparisons. NICE is content with statistical sweeps of large data sets rather than trying to discern what is happening at the coal face.

NICE puts group interventions at the top of the league for less severe depression, but ignores the context of the pandemic. Realistically, how possible will it be to get 2 therapists together with 8 clients for 90 minutes a week for 8 weeks, all face to face? The logistics and effectiveness of conducting it online are a venture into the unknown. NICE appears to operate without contextualisation of findings.

NICE are open to commentary on the proposals up to January 12th 2022. I will send the above, but I don’t think I will receive a return Christmas card any time soon. Nevertheless, a Happy Christmas to everyone.

 

Dr Mike Scott