The Phobic Avoidance of Attending to Real World Mental Health Outcomes


Michael Scott

When I look at mental health research, I notice a startling avoidance of real-world outcome measures. It seems almost phobic. Yet this type of outcome should be considered the most important. After all, who cares whether some arbitrary measure goes up or down slightly after a week or two? What we care about should be whether people have improved quality of life over the long term. Can they get back to doing the things they used to do? Do they participate in the world, socially, at work? Do they enjoy their hobbies?

So why do researchers avoid asking these questions?

One big reason is that researchers are incentivized to find a positive effect. The motto of academia is “publish or perish,” and everyone knows that null effects are rarely published. But your job may depend on your ability to publish your next study. Even worse, plenty of researchers are funded by the pharmaceutical and device industries—corporations that obviously are hoping you find a nice effect for their drugs and devices.

Even with the best of intentions, though, the people who are testing therapies are often the people who invented the therapy and their disciples—who obviously have at least an unconscious bias, hoping that their personal theory works!

So, consciously or unconsciously, researchers tend to accept a lower threshold for proof of effectiveness. It’s difficult to actually improve people’s real lives significantly, and it’s a lot easier to use a ton of arbitrary metrics and find at least one “statistically significant” effect over a short time. The upshot is, to paraphrase the Dodo in Alice in Wonderland, “all medications and psychological therapies are winners and all must have prizes.”

And it seems that the media, politicians, and midlevel healthcare bureaucrats similarly have no interest in examining the validity of outcome measures. Instead, they pass on oversimplified understandings and glib slogans as if they encapsulate the nuances of what is actually quite controversial research. Most have the best of intentions to be a “mental health advocate,” and they’re told by establishment figures that any criticism of the existing system would be “stigmatizing” and “stop people from getting treatment”: treatment that we only assume works, again, based on arbitrary statistical outcomes over the short term, not real-world improvement in the long term.

In the worst-case scenario, researchers and activists who note the misleading research and conclusions dripping with “spin” in an attempt to improve the system are called “antipsychiatry” and marginalized within their own communities.

One searches in vain for studies that ask, after treatment, “Are you back to your old self?” and, importantly, “for how long?” These are the outcomes that patients really care about. Without such questions it is impossible to chart the trajectory of a person’s functioning. Such questions are at the heart of really listening to the patient. Without that, any therapeutic edifice crumbles. But it is not rocket science, just basic respect!

At best, and rarely, studies will report on the proportion of people who lose their diagnostic status—“recovered”—as assessed by an independent clinician. But these don’t indicate the duration of recovery. Do you lose your diagnostic status after two weeks, but then worsen again by a month?

Symptom Reduction vs Added Value

Finding the right psychological treatment for the right disorder is the window through which CBT researchers have gazed for decades. Likewise, psychiatrists have gazed through a similar window, which van Os and Guloksuz call “finding the right medication for the right brain disease.” Whether therapists or psychiatrists, researchers and clinicians have looked predominantly at symptom reduction, rather than whether treatment has provided added value to the client’s life. And all of this is usually rated by the clinician; rarely do we ask clients what they think about the treatment.

There has however been some limited success in the application of CBT to depression and some anxiety disorders, at least in randomised controlled trials. But even here researchers conclude “CBT is probably effective in the treatment of MDD, GAD, PAD and SAD; that the effects are large when the control condition is waiting list, but small to moderate when it is care-as-usual or pill placebo; and that, because of the small number of high-quality trials, these effects are still uncertain and should be considered with caution.”

Similarly, other researchers found that CBT had a large effect for treating OCD, and a moderate effect for treating PTSD. But beyond these DSM diagnoses, there is a dearth of credible supportive evidence.

Evolution or Dissolution?

It is the 50th Anniversary of the British Association for Behavioural and Cognitive Psychotherapy, the self-proclaimed lead organisation for CBT in the UK. The recent annual conference included a keynote speech called “On the Evolution of Cognitive Behaviour Therapy: A Four-Decade Retrospective and a Look to the Future.”

But evidence that it has evolved is sparse to non-existent. In 2008, Ost examined the methodology of what were then termed third-wave CBT therapies and concluded that the methodology employed made them significantly less reliable than the early pre-millennium CBT studies. He opined that the third-wave therapies would not qualify as evidence-based, despite yielding evidence of significant effect sizes. The evidence for small, incremental changes in complexity and greater effectiveness of CBT is simply not there. Rather than evolution, we have evidence of the operation of the 2nd law of thermodynamics, in that therapeutic energies are being made available in less useful ways: dissolution.

Dissolution Under the Microscope

The PICOTS framework is a mnemonic used by the FDA to define evidence-based medicine. The “O” refers to outcomes and the FDA argues that these must be “outcomes that matter to patients and which predict long-term successful results.” Essentially, no cooking the books with small but statistically significant differences in outcome between an intervention and its comparator (the “C” of the mnemonic), ideally an active placebo.

The “P” stands for population, with a prerequisite to specify clearly who received the intervention, so that other researchers can replicate the findings with the same group of people. The “I” stands for intervention and requires a clear elaboration of what the treatment involved. For psychological therapies, this means the publication of a manual. The “T” refers to timeframe: how long have the treatment effects lasted. Finally, “S” refers to the treatment setting (e.g., primary care).

Over the past 40 years, psychological therapy (mainly CBT) studies have increasingly paid lip service to PICOTS. They have progressively looked less like the original pioneering efficacy studies. There has been a drift to reliance on self-report measures to define a population (P), as opposed to defining a population with a “gold standard” diagnostic interview—largely on the grounds of cost and expediency. Outcomes (“O”) have been progressively less likely to be assessed by independent blind raters.

For example, since the millennium there has been the development and evaluation of low-intensity CBT (typically defined as 6 hours or less of therapist contact). In none of these has there been an independent blind rater; outcome has always been assessed by self-report and rarely has a diagnostic interview served as the gateway into the study. Yet, in the UK, these low-intensity treatments are the first-line treatments for depression and the anxiety disorders.

Not only has the National Institute for Health and Care Excellence (NICE) endorsed the usage of low-intensity CBT, but it has recently advised that in the first instance therapists should market eight sessions of group CBT for depression.

The lack of any credible evidence on real-world impact and duration of gains troubles them not. It appears an answer to the managerial dream of throughput. Therapies are accessed and patients axed.

CBT and Antidepressants in Practice

There is nothing in the arrangement of routine psychological therapy services that guarantees that a) the “right” disorder will be identified and b) the “right” treatment will be forthcoming. Routine services, such as IAPT in the UK, do not make diagnoses. In a just-reported paper by Clark et al (2022), IAPT clinicians were asked to refer patients to a social anxiety disorder study, but only half the patients referred were found to have the disorder in the study diagnostic assessment.

Thus, left to their own devices, the routine clinicians would have been providing inappropriate treatment to 1 in 2 patients. There can be no certainty that the treatment provided in routine practice is a bona fide treatment, as fidelity checks have never been made. Fidelity checks are disorder specific, with matching treatment targets and interventions. For example, in depression, tackling the loss of the pleasure response (anhedonia) with activity scheduling.

There is a potency of treatment gap between the interventions used in randomized controlled trials and their translation into routine practice. A paper published in the Journal of Psychiatric Research last year showed a 25% response rate for those who had antidepressants and manual-driven psychotherapy (mostly CBT), no better than antidepressants alone. This compares with a 31% response rate in those given a placebo in other studies.

Proper translation of the benefits of treatments identified in randomised controlled trials cannot be done on the cheap. It requires rigorous reliable assessments and a commitment to fidelity. But the latter has to be accompanied by the flexibility of adaptation to the individual. Respect and reverence of patients’ perspectives are paramount. Without funding bodies going beyond operational matters of numbers/waiting times and focussing on real-world outcomes, the promise of randomised controlled trials will not be realised. There is a pressing need to return to basics by measuring treatment effects in the real world.

In practice, there is also unfettered discretion when it comes to a clinician’s choice of which client problems to tackle, in what order and with what evidence-based protocol.

It is, however, possible for individual therapists to deliver quality therapy. I have outlined the specifics of this in Personalising Trauma Treatment: Reframing and Reimagining. I have termed this “restorative CBT”—returning the person to their old self. In this work, the uniqueness of the individual is recognised (e.g., “what does the trauma mean to you today?”), yet at the same time commonalities are recognised, such as the state of “terrified surprise” (a combination of exaggerated startle response and hypervigilance) experienced by those most debilitated by trauma.

Unfettered Discretion on Outcome Measures

In their important book Noise, published last year, Kahneman et al highlight the poor levels of agreement on matters as diverse as judicial sentencing and psychiatric diagnosis. Such disparities are clearly unfair. But there is also heterogeneity of outcome measures. This makes it possible for authors to claim positive benefits in the absence of any real-world demonstration of effectiveness. Researchers have had a field day with unfettered discretion on outcome measures, facilitating the quest for positive findings and heightening the likelihood of publication.

Clients have a right to expect that primary outcome measures should be meaningful to them. The danger is that because of a power imbalance, clients defer to the conclusions of the professionals on outcome and, in Kahneman et al’s terms, a “respect-expert” heuristic (rule of thumb) comes into play. As a consequence, the client is likely to be continually short-changed.

A Psychological Wellbeing Practitioner Breaks The Wall of Silence


I will never forget how, when I started working at the IAPT call-centre, I was stressed and rested my head for a few moments. I was interrupted by a “clinical psychologist” who in an accusatory tone proclaimed that “it does not seem that you are working”. The die, it seems, was cast. Not once in my 3 years as a Psychological Wellbeing Practitioner (PWP) have I felt that anyone at work cared for one another.



‘When I Want Your Opinion I’ll Give It To You’


Naively I thought that “psychological services” would be a haven of openness, not a venue as “hellish” as any other sales-related job. Contentious issues were not allowed to be placed on the agenda at meetings. If I dared to bring up issues that mattered, the Managers would “have a word with me in private”. It felt like “The Twilight Zone” and “Twin Peaks”; you could feel something was not right, but everyone pretended that things were fine and that it was me who was the problem. If there was any issue with what I said, no one gently told me; instead they went straight to my manager. So I always felt paranoid that whatever I said or did might be reported.

I will never forget the moments when I would try to bring up a new approach or new knowledge only to be told “it is not in line with NICE and IAPT” and “do not read extra information because you will not need it”. A re-enactment of George Orwell’s 1984, rather than the delivery of a 21st Century psychological service. Worryingly, this seems to be the norm in the NHS, with the frontline troops powerless.


What It Is Really Like At The Coal Face


The short of it is that being a PWP is very similar to a run-of-the-mill call-centre, telemarketing or sales job. When the average worker says “but we do a great service”, I feel they are a tad delusional. I do not blame them. To survive this job you either need to resort to trickery or delude yourself that you are doing something worthwhile. The latter group probably have a mortgage to pay. We are told what to say, how to say it, when to say it and constantly told “it’s all about the numbers/targets”. We also have a script, very similar to those used by phone contract customer service staff. The hellish brilliance of IAPT is that if the targets are not reached, the organisation uses an attributional bias to blame the “practitioners”/miners and not the “system”/pit owners and fellow travellers.


The Re-Branding of What Doesn’t Work, Doesn’t Work


Pre-IAPT there were “mental health workers (MHWs)”, and the public had some idea of what this particular professional role involved. But from 2008 MHWs became Psychological Wellbeing Practitioners, leaving the public and professionals scratching their heads as to what the designation might mean. Were PWPs to be regarded as professionals or not? Despite the inherent confusion, I followed my work’s advice to the letter: did the questionnaires, kept the original scores and ploughed onwards. However, what I noticed is that many clients (I dislike using the term patients because it doesn’t feel like we are official clinicians either) were finishing treatment or dropping out with “high scores”. It was not too long until I was interrogated for a below 50% recovery rate.


Jumping Through The Hoops of ‘Recovery’


The recovery rate of 50% is impossible unless one manipulates the numbers or manipulates the clients to be compliant. I guess it comes down to good old-fashioned “sales tactics” (convincing people they need a product or that they are better than when they started). Of course, the Managers did not care. Safe to say, I found a crack in the system: since the powers that be care only about numbers, if you deliver the numbers, they will not question you. However, dare to dip below what is expected by their Key Performance Indicators (KPIs), and they are like bloodhounds searching for you. But there has never been a real-world KPI that a client would recognise, such as being back to their old selves for at least 8 weeks after treatment. Instead, clients are expected at each session to doodle on questionnaires in the presence of the PWP and, bizarrely, these are used as the metrics of recovery.


At the coal face, I can conceal, to a limited extent, what I am doing from the powers that be and deliver something of benefit. I do not hound clients for the questionnaires every single time because, let us face it, that creates a major barrier in treatment. Also, we are not MDs or Clinical Psychologists who can diagnose. It is a joke that we have to collect the data, because it is meaningless.




The issue then became that I did not feel like I was learning anything. All I was learning was how to manage office politics and be a better liar. One could apply for High Intensity Training, but that still focuses on targets, so, no thanks. Any person of good conscience will not last long in IAPT. If you have any issues as a worker with IAPT, they will say it is a “you” problem. I once mistakenly vented my frustrations with how they were doing things at a meeting. This resulted in evident displeasure, and near the end of the next meeting I was told “this is not a space to vent grievances”. If the clients and workers had a platform to vent their frustrations, I do not think IAPT would still be operational.


PWPs Ambassadors For A ‘Failed State’?


Working in IAPT is robotic: clicking tabs, ticking boxes and collecting numbers – a de-humanising experience. There is little to encourage anyone to become a PWP. In fairness I suppose, at least at a personal level, I have survived lockdown financially. But the service has in effect been “cooking the books” and making the company look good. I fear for the mental health not only of the ambassadors but for that of clients past and to come.


I am off to other pastures. Can you wonder at the turnover?


Bernice (a pseudonym)



Re-referrals to IAPT Typically Attend only One Treatment Session Because It Is So ‘Noisy’

The Improving Access to Psychological Therapies (IAPT) gatekeepers, Psychological Wellbeing Practitioners (PWPs), appear not to learn from their experience, according to Cairns et al (2014). This is a product of unfettered discretion.

In a study of 50 re-referrals, Cairns et al (2014) found that failing to engage or dropping out from treatment accounted for a large proportion of referrals: two referrals, 75%; three referrals, 60%; four referrals, 58%; and five referrals, 50%.

A Failure of Supervision

This debacle is not surprising, as Painter (2018) found PWPs had an average of 2.5 minutes to discuss a case with their Case Manager. PWPs are not supervised in any meaningful sense of the word; there is precious little opportunity for reflection or emotional support. Supervision is a necessary part of any organisation’s monitoring of the quality of output. But the only ‘supervision’ PWPs receive is admonishment if IAPT’s 50% recovery rate is not achieved, a matter on which IAPT is judge and jury.

Disagreement Between Therapists Is Rampant

IAPT fails to grasp that there is no evidence amongst PWPs of agreement on which are the important difficulties, which ones should be tackled first and with which protocol. By IAPT’s own admission, its staff do not diagnose. As such they fail the entry requirement for accessing NICE protocols. They are like burglars caught holding the loot protesting that they are NICE compliant. Rather, IAPT staff operate with unfettered discretion.

Kahneman et al (2021), in their book ‘Noise’, have shown how unfettered discretion wreaks havoc from judicial sentencing to psychiatry. The ‘Noise’ from IAPT is positively deafening, with clients ushered in every conceivable direction. The IAPT orchestra plays as the Titanic sinks.

How PWPs Become Disorientated

Cairns et al (2014) put the workings of PWPs under the microscope and found that, in their sample of 50 re-referrals, the taxonomy of problems was as shown in Table 1. But the PWPs had no signpost to indicate which problem should be the focus, in which order to tackle the problems, or which NICE protocol to follow. IAPT’s ‘Problem Descriptors’ are, in the terms of Kahneman et al (2021), a heuristic (rule of thumb) to bypass the effortful demands of a standardised reliable diagnostic interview, resulting in chaos.

Table 1
Problem                         Two referrals   Three referrals   Four referrals   Five referrals
                                (31 patients)   (11 patients)     (6 patients)     (2 patients)
Alcohol                         4               2                 1                2
Anger                           4               2                 1                2
Anxiety                         18              8                 6                2
Bereavement                     7               2                 0                0
Body image                      1               1                 1                0
Debt                            2               0                 0                0
Depression                      24              10                6                2
Drugs current                   2               1                 1                0
Drugs history                   4               0                 0                0
Domestic violence               0               0                 2                0
Eating disorder – bulimia       0               3                 1                0
Mental disorder                 1               2                 1                2
Obsessive-compulsive disorder   3               1                 2                0
Physical abuse                  6               0                 0                1
Panic attacks                   0               1                 0                0
Post-natal depression           0               1                 0                0
Relationships                   9               3                 5                0
Sexual abuse                    3               4                 1                0
Self harm                       2               3                 0                0
Social isolation                1               1                 0                0
Stress                          2               0                 1                0
Unemployment                    2               0                 1                0
Violence                        0               1                 0                1
Work issues                     4               0                 0                1

The problem is that one PWP’s anxiety case could be a colleague’s depression case; as in Alice in Wonderland, the terms mean whatever the PWP wants them to mean. Further, the list is arbitrary, with no mention of PTSD, specific phobias or social anxiety disorder. IAPT has, I think, the record on ‘Noise’, with no published kappas (a measure of inter-rater reliability).
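For readers unfamiliar with kappa, the sketch below shows one common version, Cohen's kappa for two raters, which compares observed agreement with the agreement expected by chance. It is only an illustration: the function name and the two lists of labels are hypothetical, invented for the example, and are not IAPT data or data from Cairns et al (2014).

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of cases")

    n = len(rater_a)
    # Proportion of cases on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


# Hypothetical example: two PWPs assign a 'problem descriptor' to the same 10 clients.
pwp_one = ["anxiety"] * 5 + ["depression"] * 5
pwp_two = ["anxiety"] * 3 + ["depression"] * 5 + ["anxiety"] * 2

kappa = cohens_kappa(pwp_one, pwp_two)
print(round(kappa, 2))
```

In this made-up example the two raters agree on 6 of 10 clients, but because half that agreement would be expected by chance alone, kappa is only about 0.2. Values near 1 indicate agreement well beyond chance; values near 0 indicate agreement no better than chance.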


Dr Mike Scott

The Entire IAPT Process Is Based on Deception

Is it possible to stop fake psychological therapy by telling the truth? The client is at a railway station, seeking a better life, about to board a train to a destination he/she does not know. They may have arrived at the station under their own steam and/or at the promptings of family/friends/GP. But the ‘trip advisors’ have rarely visited/evaluated the destinations. In social psychology terms, the advisors have not engaged in effortful central processing of outcome data. Instead they bypass it with a heuristic (peripheral processing): that the IAPT service ‘must be good because it is NHS/Government funded and in any case the mental health burden will be shared out’.


The client believes that they will encounter mental health professionals who can reliably diagnose and treat whatever disorder they have. But nobody told them the service does not make diagnoses [IAPT Manual (2019)]. At the 30 minute telephone assessment the Psychological Wellbeing Practitioner does not tell them that: a) they are not trained to diagnose, b) nor trained to provide psychological therapy and c) in the first instance the client will likely undergo low intensity CBT of undetermined potency in treating depression and the anxiety disorders. The deception makes for easy boarding of the IAPT train. The PWPs cram the clients onto the train by using low intensity interventions, but unsurprisingly 69% of PWPs suffer burnout, whilst the rate of burnout amongst high intensity therapists is 50% [Westwood et al (2018)]. To help mitigate their stressors the PWPs have regular supervision, but this too is a deception, as they often have less than 3 minutes to discuss a case. Over a third (38%) [Psychological Therapies Annual Report (2020-2021)] of clients get off the train before their 2nd treatment session, but not at places where they want to be. Nobody told them at the start of this level of dissatisfaction. Likewise nobody told them that only the tip of the iceberg reach a destination where they have lost their diagnostic status [Scott (2018)]; 9 out of 10 remain at square one. But IAPT keeps up the pretence, advising the ‘trip advisors’ and Clinical Commissioning Groups of a 50% recovery rate. A million a year now enter the IAPT gates. It is difficult to escape the parallel as to how people were conned into Auschwitz.


Dr Mike Scott

Antidepressants and CBT in The Real World

Bartova et al (2021) found a 24% response rate when the two were combined, and manual-driven psychotherapy conferred no added benefit. In the podcast from Mad In America these findings are set against a 31% placebo response rate. Further, there was no evidence that the interventions altered the course of a disorder, which is the prime objective of treatments for physical disorders. Rather, the focus was on symptomatic relief.


Pies and Dawson (2022) have today taken up the cudgel to attack the findings of Moncrieff et al (2022) that were the springboard for the podcast. But they are disingenuous in claiming that no one of academic credibility has ever suggested that low serotonin causes depression. For decades, at least in the UK, this has been the dominant message given to patients, with the implication that they need antidepressants to correct the chemical imbalance. Pies and Dawson (2022) have recourse to a biopsychosocial model which posits interactions of thoughts, feelings, social factors and biology, in which will be found some biological factor that is of key importance in the development of depression and through which antidepressants will be found to work. But given the track record to date this seems unlikely, and it provides little basis for current pharmacological practice, with the exception of the use of lithium.

In the Bartova et al (2021) study the therapists claimed that they were adhering to a manual-driven psychotherapy protocol, but no fidelity checks were made. This is a similar scenario to the claim made by IAPT in the UK that it delivers CBT, but without any independent corroboration. It is, I believe, the case that CBT can make a real-world difference for depression and the anxiety disorders if appropriately delivered.


Dr Mike Scott

What Is The Simplest Explanation of How Clients Fare In The Improving Access to Psychological Therapies Service?


The philosophical principle of Occam’s Razor suggests that the simplest explanation is usually the correct one. Most will present to IAPT at their worst and there will be some improvement with the passage of time and attention. But clients could just as easily have benefitted from attending the Citizens Advice Bureaux, i.e. there is no added benefit from IAPT. Last year over a third (38%) of those who accessed the IAPT service attended one or fewer treatment sessions, and it is unlikely that they would benefit from such a sub-therapeutic dose of therapy. The suspicion is therefore that IAPT doesn’t work. If one tries to explain the therapeutic gains of defaulters (defined by IAPT as attending fewer than 2 treatment sessions) from the Service, complexity enters. The confusion is not lessened when one tries to explain how it is that completers, attending an average of 7.5 sessions, apparently make gains comparable to those in randomised controlled trials, with just half the number of sessions! IAPT’s claims beggar belief.

CBT is allegedly ubiquitous in the Improving Access to Psychological Therapies (IAPT) service. Overall a 50% recovery rate is claimed. How then is it effective with one out of two completers of treatment but also ineffective with one out of two? We enter the black hole again.

It is axiomatic amongst CBT adherents that negative cognitions and avoidance behaviours perpetuate negative emotional states. It is further assumed that targeting these maintaining agents will resolve the negative emotional state. But this latter scenario will only unfold if the negative cognitions and avoidance behaviours are pivotal in the onset of the negative emotional state. If a person is suffering from, say, chronic fatigue syndrome (CFS), the salience of negative cognitions and avoidance behaviours may be questionable. The biopsychosocial model of CFS advanced by Deary et al (2007) is of such complexity that no aetiological agent, e.g. child neglect, could be ruled out. Applying Occam’s Razor, the likelihood is that a primary physical basis for CFS will be found or that it actually covers a range of disorders, each with a different biological base.

In the case of depression, negative life events and neuroticism are strong predictors. But neuroticism could be the driver for negative cognitions and avoidance behaviours. However, neuroticism itself may be a product of a particular style of engaging in mental time travel, in which negative events are given a particular salience and homage is paid to them with avoidance behaviour. It is scarcely credible that 7-8 sessions of CBT will nullify the effects of neuroticism/mental time travel for a period that the client would see as clinically meaningful, e.g. 8 weeks.

Dr Mike Scott