BABCP Response - NICE Consultation January 2022

IAPT Fails To Rebut Charge Of a Tip Of The Iceberg Rate Of Recovery

In the March issue of the British Journal of Clinical Psychology, three academics admit their links to the Improving Access to Psychological Therapies (IAPT) service, having failed to do so on an earlier occasion. Their attempted rebuttal of my paper 'Ensuring IAPT Does What It Says On The Tin', published in the same issue of the Journal, is a Donald Trump-like exposé. The British Government is looking at making NHS England accountable; to date the latter has allowed IAPT to mark its own homework, with no involvement of the Care Quality Commission. Having spent over £4 billion on IAPT, the time for change is long overdue. Below is my response to Kellett et al (2021).

Practice-based evidence has been termed a three-legged stool comprising best research evidence, the clinician's expertise and patient preferences [Spring (2007)]. Wakefield et al (2021) published a systematic review and meta-analysis of 10 years of practice-based evidence generated by the Improving Access to Psychological Therapies (IAPT) services, which is clearly pertinent to the research evidence leg of this stool. In response to this paper I wrote a critical commentary, 'Ensuring IAPT does what it says on the tin' [Scott (2021)]. In turn Kellett et al (2021) have responded with their own commentary, 'The costs and benefits of practice-based evidence: Correcting some misunderstandings about the 10-year meta-analysis of IAPT studies', accepting some of my points and dismissing others. Their rebuttal exposes an even greater depth of conflicts of interest in IAPT than originally thought. The evidence supplied by Wakefield et al (2021) renders the research evidence leg of the stool unstable, and it collapses under the weight of IAPT.


Transparency and Independent Evaluation


Kellett et al (2021) head the first paragraph of their rebuttal 'The need for transparency and independent evaluation of psychological services'. Yet these authors claimed no conflict of interest in their original paper, despite the corresponding author's role as an IAPT Programme Director. In their rebuttal Kellett et al (2021) concede, 'Three of us are educators, clinicians and/or clinical supervisors whose work directly or partially focuses on IAPT services'. This stokes rather than allays fears that publication bias may be an issue.

There has been a deafening silence from Kellett et al (2021) on the fact that in none of the IAPT studies has there been an independent evaluator using a standardised semi-structured diagnostic interview to assess diagnostic status at the beginning of treatment, at the end, and at follow-up. It has to be determined that any recovery is not just a flash in the pan. Loss of diagnostic status is a minimum condition for determining whether a client is back to their old self (or best functioning) post treatment. Studies that have allowed reliable determination of diagnostic status have formed the basis for the NICE recommended treatments for depression and the anxiety disorders. As such they speak much more to the real world of a client than IAPT's metric of single-point assessments on psychometric tests completed in a diagnostic vacuum.


The Dissolution of Evidence-Based Practice

The research evidence leg of IAPT's evidence-based practice stool is clearly flawed. Kellett et al (2021) seek to put a 'wedge' under this leg by asserting that the randomised controlled trials are in any case of doubtful clinical value because their focus is on carefully selected clients, i.e. they have poor external validity. But they provide no evidence of this. Contrary to their belief, randomised controlled trials (RCTs) do admit clients with comorbidity. A study by Stirman et al (2005) showed that the needs of 80% of clients could be accommodated by reference to a set of 100 RCTs. Further, Stirman et al (2005) found that clients in routine practice were no more complex than those in the RCTs. Kellett et al (2021) cannot have it both ways: on the one hand praising IAPT for attempting to observe National Institute for Health and Care Excellence (NICE) guidance, and on the other pulling the rug from under the RCTs which are the basis for the guidelines. Their own offering as to what constitutes research evidence leads to the collapse of the evidence-based practice stool. It provides a justification for IAPT clinicians to continue to base their clinical judgements on their expertise, ignoring what has traditionally been taken to be research evidence, so that treatments are not based on reliable diagnoses. The shortcomings of basing treatment on 'expertise' have been detailed by Stewart, Chambless & Stirman (2018), who note that 'an accurate diagnosis is an implicit prerequisite of engaging in EBP, in which treatments are largely organized by specific disorders'.

‘Let IAPT Mark Its Own Homework, Don’t Put It to The Test’


Kellett et al (2021) claim that it would be too expensive to conduct a high-quality, 'gold standard' effectiveness study with independent blind assessors using a standardised semi-structured diagnostic interview. But set against the £4 billion already spent on the service over the last decade, the cost would be trivial. It is perfectly feasible to take a representative sample of IAPT clients and conduct independent blind assessments of outcome that mirror the initial assessment. Indeed, the first steps in this direction have already been taken in an evaluation of internet CBT [Richards et al (2020)], in which IAPT Psychological Wellbeing Practitioners used the MINI [Sheehan et al (1998)] semi-structured interview to evaluate outcome, albeit that they were not independent evaluators and there could be no certainty that they had not used the interview as a symptom checklist rather than in the way it is intended. Further, the authors of Richards et al (2020) were employees of the owners of the software package or worked for IAPT. Tolin et al (2015) have pointed out that for a treatment to be regarded as evidence-supported there must be at least two studies demonstrating effectiveness in real-world settings, conducted by researchers not involved in the original development and evaluation of the protocol and without allegiance to it. Kellett et al (2021) have failed to explain why IAPT should not be subject to independent rigorous scrutiny, and their claim that their own work should suffice is difficult to understand.


The Misuse of Effect Size and Intention to Treat

Kellett et al (2021) rightly caution that comparing effect sizes (the post-test mean subtracted from the pre-test mean, divided by the pooled standard deviation) across studies is a hazardous endeavour. But they fail to acknowledge my central point: the IAPT effect sizes are no better than those found in studies that pre-date the establishment of IAPT, that is, they do not demonstrate an added value. Kellett et al (2021) rightly draw attention to the importance of intention-to-treat analysis and attempt to rescue the IAPT studies on the basis that many performed such an analysis. Whilst an intention-to-treat analysis is appropriate in a randomised controlled trial in which fewer than a fifth of those in the different treatment arms default, it makes no sense in the IAPT context, in which 40% of clients are nonstarters (i.e. complete only the assessment) and 42% drop out after only one treatment session [Davis et al (2020)]. In this context it is not surprising that Delgadillo et al (2020) failed to demonstrate any significant association between treatment competence measures and clinical outcomes, a point in fairness acknowledged by the latter author. But such a finding was predictable from the Competence Engine [Scott (2017)], which posits a reciprocal interaction between diagnosis-specific, stage-specific and generic competences.
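For readers unfamiliar with the metric, the uncontrolled pre-post effect size at issue can be sketched as follows. This is a minimal illustration only; the PHQ-9 figures are hypothetical and are not taken from any IAPT study.

```python
import math

def pre_post_effect_size(pre_mean, post_mean, pre_sd, post_sd, n_pre, n_post):
    """Pre-test mean minus post-test mean, divided by the pooled SD."""
    pooled_sd = math.sqrt(
        ((n_pre - 1) * pre_sd ** 2 + (n_post - 1) * post_sd ** 2)
        / (n_pre + n_post - 2)
    )
    return (pre_mean - post_mean) / pooled_sd

# Hypothetical PHQ-9 scores: mean 17 at assessment, 11 at last recorded session
d = pre_post_effect_size(17.0, 11.0, 5.0, 6.0, 100, 100)
```

Note that such an effect size is computed only on clients who provide a post-test score: the 40% of nonstarters and the 42% who drop out after one session contribute nothing to it, which is why a respectable-looking effect size can coexist with a very low rate of recovery.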


Kellett et al (2021) Get Deeper in The Mud Attacking Scott (2018)


Kellett et al (2021) rightly underline my own comment that my study of 90 IAPT clients [Scott (2018)] was hardly definitive, as all had gone through litigation. But they omit to mention that I was wholly independent in assessing them; my duty was solely to the Court as an Expert Witness. Despite this they make the extraordinary claim that my study had a 'high risk of bias', which casts serious doubt on their measuring instruments. They fail to understand that in assessing a litigant one is of necessity assessing current and past functioning. In my study I included use of the current and lifetime versions of a standardised semi-structured interview, the SCID [First et al (1996)]. This made it possible to assess the impact of IAPT interventions whether delivered pre or post the trauma that led to the claim. Whatever the timing of the IAPT intervention, the overall picture was that only the tip of the iceberg (9.2%) lost their diagnostic status as a result of these ministrations. Nevertheless, as I suggested, there is a clear need for a further publicly funded study of the effectiveness of IAPT with a representative sample of its clients.




Davis, A., Smith, T., Talbot, J., Eldridge, C., & Bretts, D. (2020). Predicting patient engagement in IAPT services: a statistical analysis of electronic health records. Evidence Based Mental Health, 23:8-14  doi:10.1136/ebmental-2019-300133.

Delgadillo, J., Branson, A., Kellett, S., Myles-Hooton, P., Hardy, G. E., & Shafran, R. (2020). Therapist personality traits as predictors of psychological treatment outcomes. Psychotherapy Research, 30(7), 857–870.

First, M. B., Spitzer, R. L., Gibbon, M., & Williams, J. B. W. (1996). Structured clinical interview for DSM-IV axis I disorders, clinician version (SCID-CV). Washington, DC: American Psychiatric Press.

Kellett, S., Wakefield, S., Simmonds‐Buckley, M. and Delgadillo, J. (2021), The costs and benefits of practice‐based evidence: Correcting some misunderstandings about the 10‐year meta‐analysis of IAPT studies. British Journal of Clinical Psychology, 60: 42-47.


Richards, D., Enrique, A., Ellert, N., Franklin, M., Palacios, J., Duffy, D., Earley, C., Chapman, J., Jell, G., Siollesse, S., & Timulak, L. (2020). A pragmatic randomized waitlist-controlled effectiveness and cost-effectiveness trial of digital interventions for depression and anxiety. npj Digital Medicine, 3, 85.

Scott, M.J (2017) Towards a Mental Health System That Works. London: Routledge.

Scott, M.J. (2018). Improving access to psychological therapies (IAPT) – the need for radical reform. Journal of Health Psychology, 23, 1136-1147.

Scott, M.J. (2021), Ensuring that the Improving Access to Psychological Therapies (IAPT) programme does what it says on the tin. British Journal of Clinical Psychology, 60: 38-41.


Sheehan, D. V. et al. (1998). The Mini-International Neuropsychiatric Interview (M.I.N.I.): the development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. Journal of Clinical Psychiatry, 59(Suppl 20), 22–33.

Spring B (2007). Evidence-based practice in clinical psychology: what it is, why it matters; what you need to know. Journal of Clinical Psychology, 63(7), 611–631. 10.1002/jclp.20373 [PubMed: 17551934].

Stewart, R. E., Chambless, D. L., & Stirman, S. W. (2018). Decision making and the use of evidence based practice: Is the three-legged stool balanced? Practice Innovations, 3(1), 56–67. doi:10.1037/pri0000063.

Stirman, S. W., DeRubeis, R. J., Crits-Christoph, P., & Rothman, A. (2005). Can the Randomized Controlled Trial Literature Generalize to Nonrandomized Patients? Journal of Consulting and Clinical Psychology, 73(1), 127–135.


Tolin, D. F., McKay, D., Forman, E. M., Klonsky, E. D., & Thombs, B. D. (2015). Empirically supported treatment: Recommendations for a new model. Clinical Psychology: Science and Practice, 22(4), 317–338.



Wakefield, S., Kellett, S., Simmonds‐Buckley, M., Stockton, D., Bradbury, A. and Delgadillo, J. (2021), Improving Access to Psychological Therapies (IAPT) in the United Kingdom: A systematic review and meta‐analysis of 10‐years of practice‐based evidence. British Journal of Clinical Psychology, 60: 1-37 e12259.



22 replies on “IAPT Fails To Rebut Charge Of a Tip Of The Iceberg Rate Of Recovery”

Accurate diagnosis does not exist in mental health, how can it? These are constructs, and diagnosis in this context is just a subjective personal opinion, no matter how many questions are asked or boxes are ticked. I worked in CMHTs for years and was appalled at the harmful nonsense passed off as care. When people get trapped in services, as they often do, each new psychiatrist brings another diagnosis – people had dozens of labels over many years in services. If we cannot accurately diagnose people in mental health, what does this say about the research literature most therapy research is based on? I think this honest psychiatrist does a good job of outlining the issues

The inter-rater reliability for depression and the anxiety disorders using a standardised semi-structured interview is as good as for many physical disorders. For example, in a just-published study by Jiniri et al (2021) of Covid patients, 30% were found to be suffering from PTSD using the CAPS interview, and the kappa was 0.82, a very high level of reliability, comparable to that of the measure of delirium/agitation used. Use of diagnosis does not mean that biology determines psychological state, simply that biology is involved. Judgement is always involved: in the Covid study the determination of delirium/agitation drew on inattention, disorganised thinking, evidence from those living with the person, and acute changes or fluctuations in functioning, all weighted; it is not a checklist, any more than the CAPS is. If physical/psychological assessments were simply a matter of personal opinion the kappas would be so low as to be meaningless. Physical assessments also involve judgements. jm's approach would spell the end of evidence-based anything, and we would be left with vociferous individuals locked in a battle to see who can shout the loudest, with life's casualties given no meaningful help.
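For readers unfamiliar with kappa, it measures agreement between two raters after correcting for agreement expected by chance. A minimal sketch follows; the patient data are invented for illustration and are not taken from the study cited above.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal rates, summed over categories
    expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n)
        for c in set(rater_a) | set(rater_b)
    )
    return (observed - expected) / (1 - expected)

# Hypothetical PTSD / no-PTSD calls by two assessors on ten patients
rater_1 = ["PTSD", "PTSD", "no", "no", "PTSD", "no", "no", "PTSD", "no", "no"]
rater_2 = ["PTSD", "PTSD", "no", "no", "PTSD", "no", "no", "no", "no", "no"]
kappa = cohens_kappa(rater_1, rater_2)
```

A kappa of 0 means agreement no better than chance; values around 0.8, as in the Covid study mentioned, indicate very strong agreement, which raw percentage agreement alone would overstate.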

Reliable diagnosis is just a way of ensuring that we are all singing from the same hymn sheet and not talking at cross purposes. But it comes at a cost: it is time-consuming to use a standardised semi-structured interview to achieve a reliable diagnosis, and the temptation is to use only the first part of it, an open-ended interview. But wholly open-ended interviews have proven too unreliable for research, and they cannot point the way to any evidence-based treatment, as the latter are by and large diagnosis specific. I agree that doing diagnosis in a cavalier manner (as is commonplace) is worse than not doing it at all.

Thank you for the response but this is confusing Mike – I wonder how you think it is possible to get a ‘reliable diagnosis’ in MH using the constructs in the DSM? These are inherently unreliable and invalid, and are voted into existence by a tiny subset of largely white middle-class psychiatrists in the USA, often with close ties to drug companies.

They are not the result of identifiable, objective, measurable mechanisms outside of the human mind. DSM diagnoses are nothing more than a matter of opinion, purely subjective. This is surely talking at cross purposes, nothing like singing from the same hymn sheet. Get a new psychiatrist or psychologist or therapist and you get a new song and sheet to sing from.

How long would this approach last in actual medicine? Imagine a GP who just interviews the patient for an hour, listens, administers some questionnaires utterly devoid of context or any objective facts, on the GP’s terms, and then announces X disorder and recommends a treatment. There would be an outcry, because very soon illness and death would explode in the culture. Just look at the outcomes from our mental (ill) health systems: rubbish, and getting worse no matter the service.

It seems to me we are suffering from many and varied cultural disorders currently undiagnosed and untreated and mental (ill) health services are part of what keeps this obscured from view.

Without an agreed concept of reliable diagnosis it would have been impossible for me to invalidate IAPT’s claims; without it the clock is wound back to before Beck, with his first work on reliable diagnosis. That diagnosis is misused, as nuclear energy is, does not mean that it has no utility.

IAPT is a failure, yes, and needs no recourse to DSM social constructions to invalidate its claims. Consider the reductionist, meaningless ways in which IAPT understands and tries to measure ‘recovery’: two tick-box questionnaires, the PHQ9 and GAD7, and others – as if ‘recovery’ can be quantified in numbers, when recovery is deeply personal, nuanced, fluid and not just a personal issue – it is cultural, political, economic and class-based; it’s about resources enabling possibilities for change. It is not about ‘symptom reduction’, fleeting and temporary as this is without relevant resources.

IAPT was set up to reduce the benefit bill and to keep people in their god-awful, harmful jobs – it could be called Integrating Austerity with Psychological Therapy. Medicalised language is everywhere, and it seems clear that people are sickened, not helped, by it.

Why would it be wrong to move on from Beck? I would much prefer it if this particular reductionist clock went back or forward. If it was actually helpful I’d be all for it, but it isn’t.

Human beings are faced with such unfathomable complexity that the idea this can be broken down into simple models is surely self-evidently disordered. So many people are encouraged to view their often heroic attempts to cope with myriad cultural disorders as if they were personally disordered, rather than responding to what is and has been.

I am sure you are well resourced Mike, and therapy research clearly tells us that if therapy is helpful at all, it is helpful to those who need it least of all: the well resourced. Personal resources are the biggest factor in helpful therapy, and IAPT was not set up for these people; it seems more a mass system for internalising cultural disorders and maintaining the disordered status quo.
