How blue are you feeling? The social life of questionnaires in measuring depression
by Susan McPherson and David Armstrong, Jun 29, 2022
Without tape measures and weighing scales, obesity research might look quite different. We need tape measures and scales to calculate Body Mass Index (BMI). BMI is an essential variable in epidemiology research concerned with body weight and enables certain statistical approaches. BMI enables intervention trials to be conducted in a certain way, using certain statistical approaches to measure efficacy. Research findings are incorporated into guidelines, recommendations, policy, public health strategies and so on. What would an alternative approach look like? People taking part in research could, for example, be separated into groups such as “healthy” or “overweight” with classification determined by an expert looking at them. Maybe other categories could be added such as “obese” or “underweight”. Researchers now used to working with the precision offered by universal measures of height and weight would probably be worried about the lack of reliability inherent in relying on individuals, no matter how expert, to allocate people to categories using judgement alone. Without ever having had access to tape measures and scales, epidemiology, health outcomes research and public health in this field might be unrecognisable.
Given that depression is not subject to gravity, nor can it stand up straight with its feet and back against the wall, how is it possible that epidemiology, guidelines, intervention trials and public health approaches to depression all follow similar approaches to those used for physical health issues like obesity? The answer is that over the past hundred years or so, a scientific field known as “psychometrics” emerged and developed, producing and proliferating metaphorical tape measures for depression and other psychological concepts. In a recent article, we explored these historical developments and how they affected our understanding of depression.
In the nineteenth century, before psychometrics became respectable, people with mental illness were classified by sight. Asylum psychiatrists made clinical observations of inmates and wrote notes. Inmates were rarely asked for their opinion, let alone to provide an assessment of their own “experience”. Notes included information that could be ascertained from informants (such as age, family history) or that could be observed (such as physical features, behaviour). There was no tool for measuring improvement. Therefore, there was no expectation that inmates would improve – at least not before they died of pneumonia, cholera or dysentery. Without measuring scales, the language of change is limited. Depression at this time was considered part of a “cycloid personality” which was sometimes depressed, sometimes hypomanic. Personality was something you were stuck with – something you had or did not have.
The first attempts in the early 1900s to design questionnaires to measure depression reflected this idea of a fixed personality. Questions required Yes or No responses, and Yes answers could be added up to determine the presence of a depressed or pessimistic personality. Meanwhile, psychometrics gathered scientific credentials, with the Psychometric Society founded in 1935. Over time, more sophisticated techniques developed hand-in-hand with new statistical methods. Likert published his scaling approach in 1932, making it possible to talk about amounts of things like depression as opposed to their presence or absence. Ideas about construct validity, test-retest reliability and eventually sensitivity to change developed along with ways to test these, such as factor analysis – first put forward by Spearman in 1904 and developed further by Thurstone in the 1930s. Cronbach published his test of internal reliability in 1951.
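To make the shift from counting Yes answers to measuring amounts more concrete, here is a minimal sketch in Python. It is not taken from any of the historical instruments: the response data, the item labels and the function names are all invented for illustration. It shows a Yes/No questionnaire scored by adding up Yes answers, a Likert-style questionnaire scored as a total, and the coefficient now usually called Cronbach's alpha, computed from item variances and the variance of the total scores.

```python
from statistics import pvariance

# Hypothetical responses: each row is one respondent, each column one item.
# All numbers are invented purely for illustration.
yes_no_answers = [  # 1 = Yes, 0 = No
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
]
likert_answers = [  # assumed labels: 0 = "not at all" ... 3 = "nearly all the time"
    [2, 1, 3, 2],
    [0, 0, 1, 1],
    [3, 3, 2, 3],
    [1, 0, 1, 0],
    [2, 2, 3, 3],
]


def total_scores(responses):
    """Add up each respondent's item responses into a single score."""
    return [sum(row) for row in responses]


def cronbach_alpha(responses):
    """Coefficient alpha: (k / (k - 1)) * (1 - sum of item variances / total variance)."""
    k = len(responses[0])  # number of items
    item_variances = [pvariance(column) for column in zip(*responses)]
    total_variance = pvariance(total_scores(responses))
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    print("Yes counts:", total_scores(yes_no_answers))
    print("Likert totals:", total_scores(likert_answers))
    print("Alpha:", round(cronbach_alpha(likert_answers), 3))
```

The Yes counts only say how many items a person endorsed, whereas the Likert totals spread people along a range. Alpha is closer to 1 when the items rise and fall together across respondents, which is the sense in which a set of items is said to measure one thing.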
It’s possible to see these developments as accompaniments to the march of knowledge in mental health – tools that enabled mental health researchers to understand psychiatric disorders more clearly and become better judges of psychiatric interventions. However, a “social life of methods” approach takes a different perspective. What if the development of these tools actually enabled a new form of discourse around mental illness? Being able to measure degrees of sadness makes it possible to talk about depression not as a fixed personality trait but as something we might all have in different degrees. The asylum wall, which had been a physical boundary between the sane and insane – the Yes and the No – could come down. The Yes people could mix with the No people until everyone is somewhere on the Yes dimension. What if being able to measure change in degrees makes it possible to talk about improvement and deterioration? We can then make precise judgements about the efficacy of different treatments. This moves mental health policy beyond clinical judgement behind closed asylum doors because it now concerns all of us on the Yes dimension. What if the evolution of depression measures follows the psychometric techniques available rather than vice versa such that depression is re-constructed into something defined by what it is possible to measure?
Rather than pharmaceutical conspiracies or psychiatric expansionism, the social life of methods approach provides a narrative less reliant on human agency. We suggest that the logic provided by psychometrics continuously produces and reproduces the construct of depression. Although depression as a construct and field of knowledge is highly contested, critique almost always revolves around attribution of nefarious agency to a group of people rather than a form of unrelenting logic.
This does not mean that groups of people with agency could never attempt to exploit the logic. Another recent article explores how in the 1990s a number of pharmaceutical companies attempted to develop a new type of scale focusing on “quality of life in depression”. The idea stemmed from a culture of consumerism in the 1980s during which the pharmaceutical industry had ambitions to market psychiatric drugs directly to consumers rather than via psychiatrists. They hoped that by refocusing questionnaires on aspects of life important to consumers (social life, family, work) instead of “symptoms” (such as crying and sleep), people might be more easily persuaded of the benefits of antidepressants. The fixed logic of psychometrics did not easily bend to this idea. It became impossible to clearly establish “discriminant validity”, i.e. to demonstrate that depression was a separate construct from quality of life among depressed people.
Given the subsequent lack of published data on quality-of-life outcomes from pharmaceutical trials, it seems likely that these new measures failed to demonstrate efficacy of pharmaceutical treatments. Moreover, in spite of much lip service by policymakers to the importance of quality-of-life in health service delivery, no nation has switched from symptoms to quality-of-life as a primary outcome for national depression guidelines. We suggest this may be further evidence that questionnaires have a social life – that the inexorable logic of psychometrics produces its own reality ranging from disease constructs, to research agendas, professional practice, practice guidelines, policy and public health. We suggest these practices would be unrecognisable without depression questionnaires.
About the Authors: Susan McPherson is contributing editor to the Cost of Living blog. David Armstrong is Emeritus Professor of Medicine and Sociology at King’s College London. He is the author of Political Anatomy of the Body (1983) and A New History of Identity (2002).