The Cognitive Therapy Rating Scale – a rite of passage?

As an External Examiner I found it painful to watch videos of trainees trying to ensure that their interview with a client was compliant with the CTRS (or its successor, the Revised Cognitive Therapy Rating Scale, R-CTRS). I vividly remember one Course Leader giving a student a score above the competence threshold of 36 (see R-CTRS Manual, link at end of blog), despite the student barely making eye contact with the client; the student was busy leafing through the 12-item scale (each item scored 0-6) on his lap! Unfortunately institutions were obliged to use it, and students groaned in silence. Before another cohort of trainees endures this rite of passage, those who have the courage to should consider that the R-CTRS may cause more problems than it solves, i.e. that it is iatrogenic.

Life Before the CTRS

In the seminal study of CBT for depression, Rush, Beck, Kovacs and Hollon (1977) did not use the CTRS. Later, when Steve Stradling and I conducted a randomised controlled trial comparing individual CBT, group CBT and treatment as usual [Behavioural Psychotherapy, 18, 1-19], we simply followed Beck’s protocol (pp. 409-411 of Cognitive Therapy of Depression by Beck, Rush, Shaw and Emery, 1979, published by John Wiley and Sons) and achieved good results.

The Poor Predictive Value of the R-CTRS

The CTRS has only been evaluated in a sample of depressed clients undergoing cognitive therapy [Shaw et al (1999)]. Therapists’ scores on it did not predict outcome on the self-report measures, the Beck Depression Inventory or the SCL-90 (a more general measure of psychological distress). They did predict outcome on the clinician-administered Hamilton Depression Scale, but explained just 19% of the variance in outcome, and it was the structure parts of the scale (setting of an agenda, pacing, homework) that accounted for this 19%, not the items measuring Socratic dialogue etc. The authors concluded: ‘The results are, however, not as strong or consistent as expected’. When the CTRS was first evaluated, the results were not compelling. I enlarged on this previously in my blog ‘The Mis-Selling of the R-CTRS’.


Since that earlier blog I have also blogged ‘Jump Through Our Hoops and Make No Difference To Client Outcome’, in which I noted an IAPT study that had used the R-CTRS to predict outcome and found no relationship between the two.

Cavalier Usage of The R-CTRS

Studies using the R-CTRS tend to be cavalier. In a study comparing BA and CBT, Richards et al (2016) used the R-CTRS but did not report how this, or indeed the competence measure for BA, related to outcome. Richards et al said that although both modalities were equally effective in treating depression, BA was to be preferred because it was cheaper to train therapists in BA. They further claim that the CBT therapists were competent, with a mean score of 37.9 on the R-CTRS. But this score is almost identical to the threshold of 36 that the R-CTRS Manual deems necessary for a competent therapist, so on this metric, assuming scores were spread fairly evenly around the mean, close to half of the CBT therapists were not competent. Thus there has been a meaningless implementation of CBT. Paradoxically, it may be that the CBT therapists’ performance had been made worse by having to use the R-CTRS.

Spinning The R-CTRS

Given the paucity of evidence for the utility of the R-CTRS for depression, and its possible negative side effects, one would expect that it would not have been applied to other disorders. Unfortunately trainees are asked to apply it to whatever the client’s complaint is, or ‘problem descriptor’ as IAPT would have it. Little wonder that trainees are stressed by its usage.

Revised Cognitive Therapy Rating Scale

Richards et al (2016)

Spinning CBT Is Ubiquitous

CBT luminaries are spinning the plates furiously this conference season. A paper in next month’s Behavior Therapy, 50 (2019) 864–885, by clinicians from the University of Sheffield, has an abstract that advocates Group Behavioral Activation for depression as a front line treatment. The abstract also claims a moderate to large effect on depressive symptoms. Most people are unlikely to read further than the abstract, but closer inspection reveals that the conclusions are deeply flawed.

In passing the abstract mentions that the standardized mean difference (SMD) between group BA and waiting list was 0.72. This would cause few people to question the findings, but the results are actually of doubtful clinical relevance, because it means there is less than one standard deviation in outcome between the treated group and the waiting list. Your eyes may already be glazing over at the thought that some stats are on the way, but bear with me. If a group of depressed patients had a mean Beck Depression Inventory score of 28 at the start of treatment and a mean score of 23 at the end [assuming a standard deviation, i.e. spread of results, of 7, taken from the Scott and Stradling (1990) study, Behavioural Psychotherapy, 18, 1-19], this would produce an SMD of 0.71, i.e. about the same as in the University of Sheffield analysis. The average person experiencing this change of score is unlikely to feel that they are back to their normal self, and is likely to view it as part of the normal cycling of mood, influenced by positive events, e.g. the company/support of fellow sufferers for a time in a group. In none of the Group BA studies was there an independent assessor determining whether clients were still depressed, or the permanence of any change. Unsurprisingly the authors found that Group BA was no better than any other active treatment (i.e. controlling for attention and expectation), and make an implicit plea for the Dodo verdict: ‘all therapies are equal and must have prizes’.
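The arithmetic in this illustration can be checked in a few lines. What follows is a minimal sketch only: the function name `standardised_mean_difference` is hypothetical, and it simply divides the difference between two mean scores by an assumed common standard deviation of 7, as in the worked example. It is not the calculation used in the University of Sheffield analysis, which is not described here.

```python
def standardised_mean_difference(mean_a, mean_b, sd):
    """Difference between two mean scores, expressed in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_a - mean_b) / sd


# Figures from the worked example: mean BDI of 28 at the start,
# 23 at the end, and an assumed standard deviation of 7.
smd = standardised_mean_difference(28, 23, 7)
print(f"SMD = {smd:.2f}")  # SMD = 0.71
```

A five point drop on the BDI is therefore enough to generate an SMD of this size when the spread of scores is 7.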

In the body of the paper the authors acknowledge that the Group BA studies are, save one, of low quality, and that analyses were on treatment completers as opposed to the more rigorous intention to treat. But there is no indication anywhere as to what proportion of people recover from depression with any permanence. Yet this did not stop the spin in the abstract! Unfortunately it will likely be music to the ears of IAPT, and one can expect Group BA to be advocated soon, particularly as it is contended that BA is easier for therapists to learn than CBT.

In 1990 Steve Stradling and I published [Behavioural Psychotherapy, 18, 1-19] a study of depressed clients comparing group CBT, individual CBT and a waiting list condition. For group CBT the initial mean BDI was 29.0 and the end of treatment score was 6.2, whilst for individual treatment the comparable scores were 28.21 and 11.53. However, those on the waiting list also improved, from 25.89 initially to 20.26 at the end of the waiting period. Thus it is far from clear that the results from the University of Sheffield analysis on Group BA are actually better than those of putting people on a waiting list.
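To make the comparison concrete, the changes in mean BDI score for the three arms can be tabulated. This is a rough sketch only: the names `arms`, `ASSUMED_SD` and `bdi_change` are hypothetical, and the standard deviation of 7 is the assumed spread used in the illustration earlier in this post. These are within-group changes, so they are not directly equivalent to the between-group SMD of a meta-analysis.

```python
# Mean BDI scores quoted in the text: (start, end) for each arm.
arms = {
    "group CBT": (29.0, 6.2),
    "individual CBT": (28.21, 11.53),
    "waiting list": (25.89, 20.26),
}

ASSUMED_SD = 7.0  # assumption: the spread used in the earlier illustration


def bdi_change(start, end):
    """Fall in mean BDI score between the start and end of the period."""
    return start - end


changes = {arm: bdi_change(start, end) for arm, (start, end) in arms.items()}

for arm, change in changes.items():
    print(f"{arm}: {change:.2f} points, {change / ASSUMED_SD:.2f} SD units")
```

On these figures the waiting list alone improved by about 5.6 points, roughly 0.8 of a standard deviation under the assumed spread of 7.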

In the August 2015 issue of the Psychologist I wrote:

“In the July issue of the Psychologist you referred to a meta-analysis of 70 CBT studies for depression conducted by Johnsen and Friborg (2015) and opined ‘CBT doesn’t seem to be helping reduce depression symptoms as much today as it used to when it was first developed in the 1970s’. But this conclusion may be premature: inspection of Table One of Johnsen and Friborg’s study shows that from 1977 up to and including the millennium 85% of studies were randomised controlled trials (RCTs), but from 2001-2014 the comparable figure was 65%. One of the hallmarks of an RCT is blind assessment, using a standardised diagnostic interview. Thus there can be no certainty that populations treated post the millennium are comparable to those before. Johnsen, T. J., & Friborg, O. (2015, May 11). The Effects of Cognitive Behavioral Therapy as an Anti-Depressive Treatment is Falling: A Meta-Analysis. Psychological Bulletin. Advance online publication.” Reliance on weak evidence has become a post-millennium phenomenon.

But spin is not confined to recent CBT studies. Jellison et al (2019) examined spin in leading journals of psychiatry and in the journal Psychological Medicine: of 116 randomised controlled trials, spin was identified in 56%, with 21% in the abstract results section and 49.1% in the abstract conclusions section.

Please let me know what work should be given a spin award this conference season.

The Care Quality Commission (CQC) Is Being Duped by IAPT

IAPT is camouflaging what most of its clients receive and has eschewed a focus on clinically relevant outcomes. But one of the domains that the CQC assesses services against is whether they are outcomes-focused. The CQC needs to conduct an inquiry into IAPT.

Guided Self-Help (GSH) has been the diet of 71% of IAPT’s clients, but therapists have now been advised not to mention GSH, because it may be off-putting, and to refer instead to ‘low intensity telephone CBT’. This is notwithstanding that NICE has justified its support for low intensity CBT on the basis of studies that were termed ‘GSH’. There is a transparency about offering GSH; clients have a right to know what they are letting themselves in for. Informed consent cannot be meaningfully given to a term like ‘low intensity telephone CBT’.

The matter of informed consent is compounded further by IAPT’s failure to inform clients of what clinically relevant outcome they can expect, in particular what minimally important difference the client can expect and clearly see as meaningful. Changes on a psychometric test do not qualify as a clinically relevant outcome; by contrast a client can clearly understand, say, an expectation of being back to their usual self.

IAPT’s ‘low intensity telephone CBT’ itself rests on a fault line: studies that found statistical significance between groups, e.g. computer assisted CBT vs waiting list, but without a) any discussion of the clinical relevance of the findings and b) blind independent assessment of outcome. Dissemination of the low intensity interventions has been promoted on the back of statistical significance rather than clinical relevance. This makes it imperative that the CQC becomes outcomes-focused in a transparent way and is not sucked in by IAPT’s self-serving surrogates.

IAPT’s Eventual Implosion

There are no limits to IAPT’s ambitions, making failure inevitable. IAPT’s target in practice is “whatever the client complains of” and treatment is operationalised as “whatever its therapists do”. Both foci are so loose that it cannot fulfil its promise, like a totalitarian revolution that runs out of steam.

The IAPT Manual published a year ago leaves both targets and treatment ‘fuzzy’, whilst proclaiming a commitment to NICE Guidelines. A target of ‘client complaints’ makes no distinction between ‘disorder’ and everyday unhappiness/stresses. Yet the treatments advocated by NICE are quite specific to disorders.

At most IAPT staff ask about some symptoms of a disorder, without covering all of them. Nor are they taught to ask whether a symptom is present at a clinically significant level, i.e. whether it is making a real world difference to a client’s life. Only clinically significant symptoms count in DSM. As a result IAPT clients are typically treated for disorders they don’t have, without any fidelity check on compliance with a protocol.

There is tremendous vested interest, financially, emotionally and intellectually, in IAPT continuing as it is, marking its own homework with applause from BABCP and the BPS.

CBT’s House of Cards?

When the acid tests of the Cochrane Collaboration Tool and the GRADE Handbook for the quality of randomised controlled trials are applied, studies of low intensity CBT fail to clear the methodological bar, whilst only high intensity studies for depression and the anxiety disorders make a successful jump. This calls into question IAPT’s penchant for disseminating CBT for everything, with an imprimatur from BABCP, which is paying travel expenses of up to £100 for special interest group members to attend a pre-conference workshop, ‘Revolution in Mental Health Service Delivery: The Evolution of Low Intensity CBT’, on Tuesday 3rd September.

One of the seven domains highlighted by the Cochrane Collaboration tool for assessing bias is the blinding of outcome assessment. I have been unable to locate one outcome study of low intensity CBT that fulfils this criterion, whilst there is a significant minority of studies of high intensity interventions for depression and the anxiety disorders that do.

The GRADE handbook for assessing the quality of trials comments in section 3.4 that ‘not infrequently, outcomes most important to patients remain unexplored’. With regard to psychological interventions, clients are rarely asked by someone independent of the study whether, and if so for how long, they have been back to their usual selves since treatment. Instead, reliance is most commonly placed on a surrogate measure, a client-completed questionnaire, as opposed to an independent clinician’s assessment using a standardised diagnostic interview to determine whether there has been a loss of diagnostic status.

These concerns are crystallised in a study of CBT for Health Anxiety conducted by Cooper et al (2017), Behavioural and Cognitive Psychotherapy, 2017, 45, 110–123 doi:10.1017/S1352465816000527

Whilst 10 of the 13 studies in the meta-analysis used the DSM or ICD-10 to determine whether people should be included, in no study was meeting these criteria used as an outcome measure. To be no longer suffering from the identified health anxiety at end of treatment/follow up would have been a client-important outcome. Instead the self-report Health Anxiety Questionnaire was used as a surrogate. Cooper et al (2017) attempted to rate studies with the Cochrane Collaboration tool, using a summary score for the seven domains, but this bore no relation to outcome and, as the authors admitted, was a questionable procedure. Despite this, CBT was claimed to be an effective treatment for health anxiety.

I am afraid I can’t join in the jamboree for IAPT services that takes place at the BABCP annual conference. I doubt that the ‘House of Cards’ will be discussed and it would likely be seen as banned literature on IAPT training courses.


GRADE handbook

The Gagging of Clients as Storytellers

‘Don’t listen to the story, treat the symptom’: that is the advice to be given to IAPT’s PWPs attending a 3 hour workshop on groupwork on November 28th 2019. It reflects similar advice given to IAPT clients attending a 6 week course on ‘Understanding PTSD’, in which clients are instructed not to talk about their trauma but rather to reflect on what they have found helpful so far. This gagging of IAPT clients is consistent with the Organisation’s 30 minute telephone assessment. But it is inconsistent with the need to help clients overcome cognitive avoidance, e.g. in PTSD, avoiding talking about their trauma.

In Simply Effective Cognitive Behaviour Therapy (Routledge, 2009) I suggested that clients need treatment simultaneously for all the disorders from which they are suffering. This is to look at the totality of the client’s story, not to elevate one part of it (e.g. the disorder that is most impairing) and just treat that. Interestingly Barlow et al (2017), see link below, compared focussing just on the main disorder from which a person was suffering (from amongst panic disorder, GAD, social anxiety disorder and obsessive-compulsive disorder, even though most people had more than one disorder) with a protocol that could be adapted for any of these disorders (termed a Unified Protocol, UP), and retention of clients was better with the latter. This suggests that addressing the whole story is best, as well as being more respectful.

Care has to be taken, however, with Barlow’s transdiagnostic approach, in that the term denotes just those suffering from an anxiety disorder, excluding PTSD. Over half of the clients had a degree. All treatments were developed by Barlow and his colleagues, and there has been no independent replication. Treatment was individual, and there is no evidence that it works in groups. The treating clinicians were highly qualified/trained and delivered both treatments; as the UP was the new kid on the block and their ‘kid brother’, that may explain the slightly better results with the UP.

Barlow et al (2017)

IAPT Misses The Boat Using a Train Timetable

IAPT found so few cases of generalised anxiety disorder that a randomised controlled trial comparing CBT with the antidepressant sertraline collapsed, Buszewicz et al (2017), see link below. The metric IAPT uses, problem description, is clearly useless, as GAD cases are ubiquitous, affecting 4.7% of the population, more common than depression.

Similarly adjustment disorders are ubiquitous, but IAPT doesn’t use such a label; it engages in treating them and then discovers its mistake. What a waste of resources. Dana was distressed by the criminal behaviour of her ex and her children’s exposure to him. She had 4 treatment sessions, which she described as helpful, but the service advised that treatment should be suspended and the outcome was ‘mixed’ (scores below).

Pre 7 13
Post 6 12


Yvonne had a long history of anxiety but no problems in the months before she tripped and injured herself, which initially precluded her use of her main coping mechanism, exercise. She was given treatment for ‘anxiety’ in IAPT, which she described as helpful, but she only had a fear of falling, a specific phobia, and this was not addressed at all in treatment. Yvonne had not been asked what would constitute her being back to her usual self, i.e. what would be a clinically relevant difference post treatment. Instead IAPT goes blindly on with its own idiosyncratic metric and claims success on the basis of the changes in scores below:

Pre 19 18
Post 6 7

For speed IAPT weds itself to problem specification, but it doesn’t take clients to their destination of a clinically relevant outcome.

Buszewicz et al 2017

Only The Client Knows Whether Psychological Treatment Has Made a Clinically Relevant Difference

Trouble is, nobody asks them! When was the last time you remember a client being asked ‘are you back to your usual self with the treatment you have had?’ Organisations such as IAPT have their own metric, a decrease on a psychometric test, and in secondary care psychiatrists will opine ‘seems a bit brighter today, increase…’. These ‘metrics’ ensure the survival of the Organisation, but have no demonstrated relationship to loss of diagnostic status as assessed by a clinician independent of the service provider.

In a study by Stegenga et al (2012), see link below, depressed patients were followed up over 3 years. Whether their depression took a chronic (17%), fluctuating (40%) or remitting (43%) course, they all showed decreases in PHQ9 scores throughout the study, without any psychological intervention. The only exception was a worsening of PHQ9 score at 6 months for the chronic subgroup. Similarly, a 12 year study of anxious patients, Bruce et al (2005), showed they were only suffering from their anxiety disorder 80% of the time. Thus finding a decreased psychometric test score does not, per se, mean anything.

Bruce et al (2005) link

Stegenga et al (2012) link

Organisations and Clinical Commissioning Groups much prefer to talk about operational matters, numbers and waiting lists, and show no interest or expertise in reliably assessing clinically relevant outcomes. But it is not just these bodies: for the past decade the leading journals have predominantly published papers on the efficacy of psychological interventions with no insistence that there should have been blind independent assessment. Instead self-report measures have ruled, with little awareness that their completion is subject to demand effects and that the measures often bear no obvious relationship to the construct under examination.

It is difficult to escape the conclusion that clients are largely fodder for the Organisations. This is a problem that will not be resolved by increased funding for mental health services, albeit that this is clearly needed, or by atypical clients serving as tokens on mental health bodies. The fundamental problem is a lack of respect/reverence for clients.

Populist Mental Health Myths

Poor psychological therapy services are as much about populist mental health myths as underfunding. Drill down beyond IAPT and NICE and you enter a sub-atomic world very different to that of the orchestrators.

In the microscopic world people are concerned with:

‘will I get back to my old self with this therapy?’

‘what proportion of people like me, get over this with therapy?’

‘are the effects of therapy temporary or permanent?’

‘are you interested in and committed to me, or am I just a number?’

Moving up to the macroscopic world, real world outcomes are replaced by surrogates, ‘a change on a questionnaire’, but without any certainty that the questionnaire is measuring anything pertinent to what the person is suffering from! There is no independent assessment of the outcome of routine practice.

Myth One: IAPT and NICE are at one

IAPT insists that it is NICE compliant, i.e. that its treatment protocols match the identified condition. But IAPT clinicians do not diagnose; instead they make a judgement using ICD-10 diagnostic codes. This weak surrogate ignores that NICE Guidance assumes a reliable diagnosis and advocates the DSM criteria, not ICD-10!

Myth Two: IAPT is credible because of its advocacy of NICE Guidelines

For a decade the NICE guidelines have called for an evaluation of low intensity CBT vs counselling vs treatment as usual, which would include observer rating. Such is NICE’s ongoing uncertainty as to the value of low intensity CBT.

Myth Three: The value of low intensity CBT has been demonstrated

Not if one insists on methodologically strong studies involving independent outcome assessors.

Myth Four: CBT is the answer

NICE points out that even where there is the strongest evidence in favour of the use of CBT, in depression, the effects are ‘modest’. It also notes that there are comparatively few studies of Behavioural Activation (BA) and makes a clarion call for more head to head research between BA and CBT, but stresses the need for inclusion of observer rated assessment in such a study. It might also have added that there is a need for an attention control group. There is a need for more humility in IAPT about the contribution of CBT.

Myth Five: Approval by NICE equals evidence of efficacy

Not so. NICE guidelines are the fruits of a committee’s deliberations, primarily about the results of randomised controlled trials, but there is no assessment of those RCTs using the Cochrane risk of bias tool, which includes requirements such as observer rated outcomes.

Myth Six: IAPT never departs from NICE

With regard to ‘Medically Unexplained Symptoms (MUS) not otherwise specified’, the recommended specialised form of CBT is entirely a product of the IAPT Education and Training Group (ETG). The ETG is also a reference source for the specialised form of CBT for irritable bowel syndrome and chronic pain, albeit that 2 NICE guidelines are also referred to.

Myth Seven: IAPT is becoming more robust in evaluation

Not according to its recent forays into disorders like chronic fatigue syndrome (CFS), where reliance is placed on a psychometric test, the Chalder Fatigue Scale, of doubtful relevance to the CFS construct, and without any independent observer rating.

Myth Eight: Real world change can happen without hospitality and commitment

Hospitality is notably absent in clients’ first contact with IAPT; therapists are focussed on not becoming the subject of sanction. In the real world the initial formulation of a client’s problem/s is often in need of significant modification, but the time constraints on therapists rarely cater for the necessary adaptations or for the importance of persistence on the part of the therapist.

Myth Nine: It is ok to discharge a client as soon as their score hits recovery

For 40% of people experiencing depression, their disorder takes a variable course, whilst for the anxiety disorders, sufferers are only affected 80% of the time. Thus discharging at the first signs of a low score is simply capitalising on chance, there can be no certainty that lasting meaningful change has occurred. The stage is set for a revolving door.

This list of myths is by no means exhaustive; please feel free to add your own. However, the microscopic and macroscopic worlds are, it seems, different universes.

