‘Psychometric Tests, Administered In Isolation, Are Not Footprints of Anything’ – IAPT’s Big Mistake

IAPT uses psychometric tests to identify ‘cases’ and changes in test score to gauge effectiveness. This is not an evidence-based assessment, and without evidence-based assessment there can be no evidence-based treatment.


A psychometric test can’t exist in a vacuum; it has to refer to something tangible, i.e. it must have criterion-related validity. For example, in last month’s British Journal of Psychiatry, Quinlivan et al [‘Predictive accuracy of risk scales following self-harm’] assessed the ability of risk scales to predict whether a person will make a further suicide attempt (the criterion). They found that these much-used scales did not in fact predict self-harm, i.e. they lacked criterion validity. Thus when psychometric tests such as the PHQ-9 (an intended measure of depression) and GAD-7 (an intended measure of generalised anxiety disorder) are used, individual test results are only meaningful if they are actually the ‘footprint’ of the construct under examination. Imagine seeing a footprint in the snow:

Does it belong to the abominable snowman, a polar bear, a human being or the great yeti? Without a specification of what it refers to, changes in the footprint are meaningless. Thus when IAPT uses the PHQ-9 and GAD-7 in isolation, it is not known to what they refer, as no reliable diagnostic interview has been performed. Is the person simply stressed, depressed, worried well or what? The myriad possibilities likely have very different trajectories, e.g. the stressed improving as the stressor passes. Lumping them all together creates confusion and prevents any evidence-based assessment, which is the foundation for evidence-based treatment. Clients cannot be reliably signposted to anything, resulting in the wrong tools being used.

Worryingly, I wrote a rejoinder to a paper by Ali et al in this month’s Behaviour Research and Therapy, on relapse after IAPT low-intensity intervention, making the point that they had abused psychometric test results in just this way. It was rejected; the reviewers pointed out that I hadn’t included a reference supporting criterion-related validity! I despair. The reviewers tried to justify the approach of Ali et al on the grounds that the PHQ-9 is a reliable instrument, identifying 80% of those who are depressed (sensitivity) and 80% of those who are not depressed (specificity), which is true. But this provides no basis on which to judge whether Mr X, who scored say 25 on the PHQ-9, should a) be regarded as a ‘case’ of depression and, relatedly, b) have his progress charted with this measure. Both a) and b) can only be determined by a reliable standardised diagnostic interview, which is absent from the IAPT assessment protocol. If you found your electrician was measuring current with a voltmeter you would, forgive the pun, be ‘shocked’; we need to create a similar state of alarm about the quality of audit in IAPT. There is a pressing need for independent rigorous assessment.
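
A rough illustration of why sensitivity and specificity alone cannot settle whether a given individual is a ‘case’: the chance that a positive screen reflects true depression (the positive predictive value) also depends on how common depression is among the people being tested. The sketch below applies Bayes’ rule to the 80%/80% figures quoted above. The prevalence values are purely hypothetical assumptions chosen for illustration, not IAPT data, and the function name is my own.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a person who screens positive truly has the condition."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Sensitivity and specificity as quoted in the text; prevalences are hypothetical.
for prevalence in (0.10, 0.30, 0.50):
    ppv = positive_predictive_value(0.80, 0.80, prevalence)
    print(f"prevalence {prevalence:.0%}: positive predictive value {ppv:.0%}")
```

Under these assumed prevalences the same test result is right about a third of the time in one population and four times out of five in another, which is why a score on its own cannot stand in for a diagnostic interview.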

Dr Mike Scott
