To study the efficacy of psychotherapeutic treatments, researchers use validated self-report questionnaires to quantify symptoms. Evidence for therapeutic efficacy is gathered in randomized controlled trials (RCTs). In an RCT, symptom levels are assessed pre- and posttreatment for a group of patients and compared to those of a group of patients who received an alternative treatment or no treatment at all (a placebo). RCTs are regarded as the best type of evidence in therapeutic research. In meta-analyses, the results of multiple RCTs are combined, and this serves as a basis for large-scale policy decisions on the organization of evidence-based mental health care. RCTs rely on significance tests to show that the treatment group performed better than the control group (alternative or no treatment). In psychotherapeutic research, evidence is thus based on numbers (differences between the groups and p-values for significance). This article describes how patients who participate in research translate their story into numbers, so that their story becomes ‘data’ that is used as evidence.
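The group comparison described above can be illustrated with a minimal sketch. It assumes invented questionnaire totals and uses a permutation test as a stand-in for whatever significance test a given RCT applies; the function names and all numbers are hypothetical and not taken from any study.

```python
import random
from statistics import mean


def change_scores(pre, post):
    """Symptom reduction per patient: positive values mean improvement."""
    return [before - after for before, after in zip(pre, post)]


def permutation_p_value(treated, control, n_perm=5000, seed=0):
    """One-sided p-value: how often a random relabelling of patients gives a
    group difference at least as large as the observed one."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical questionnaire totals (invented for illustration only).
treat_pre, treat_post = [30, 28, 35, 31, 29, 33], [18, 20, 22, 19, 25, 21]
ctrl_pre, ctrl_post = [31, 29, 34, 30, 28, 32], [29, 27, 33, 30, 25, 31]

treated = change_scores(treat_pre, treat_post)
control = change_scores(ctrl_pre, ctrl_post)
observed, p = permutation_p_value(treated, control)
print(f"difference in mean improvement: {observed:.2f}, p = {p:.4f}")
```

Every number that enters this calculation is a questionnaire score, so the outcome can only be as trustworthy as the scores themselves.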
In psychotherapy research, numerical data is collected using standardized self-report questionnaires: participants choose one of several answers. However, this comes with issues. First, the respondents have to be able to understand the questions. This is not always the case: sometimes a respondent interprets a question differently than intended. Participants may also adjust their response to what they believe is expected of them. This can lead to over- or underestimation in responses. Second, respondents have to score their behavior on numerical scales. This scoring is affected by the format. For example, participants feel more comfortable scoring “extreme” on a -5 to +5 scale than on a scale going from 0 to 10. The reason for this is that participants feel that a ‘0’ in the -5 to +5 scale reflects ‘neutrality’, while in the scale going from 0 to 10 they feel that it reflects ‘absolute absence’. Third, the types of answers that respondents can provide are limited and do not allow for nuances. Lastly, respondents find it hard to decide which reference point they have to keep in mind while scoring (so: does this question refer to my whole life, the last year, the last week?).
These issues are just part of a broader problem. In psychology, human beings are both the object and the subject of the science, which means that we have to ask the object (the human) to study their own behaviour (the subject). When humans are asked to reflect on their feelings, thoughts, attitudes and behaviours, the idea of being observed or assessed may invoke a change in their behaviour (which is called the Hawthorne effect). For example, someone may answer in a socially desirable way or emphasize a set of complaints when they know that they are being observed. It is hard to deal with these issues, because each human is different.
In psychotherapy, these problems are even more prominent because the objects of interest are behaviours, thoughts and feelings that already deviate from the norm. In clinical practice, patients often experience many different symptoms. In research, however, it is often required to isolate individual symptoms in order to engage in causal attribution.
Even though these kinds of epistemic problems exist, they are hardly discussed in empirical papers. Instead, the numerical approach is still regarded as the most reliable and trustworthy means to obtain evidence in psychotherapy research. The idea is that when RCTs are designed properly, when the questionnaires/measures used are valid, and when the data analysis is reliable and transparent, the evidence will be sound. However, this reasoning leaves out that the value of the evidence, even if the analysis is perfect, still depends on the input into the analysis. The data that is collected has to be valid in the first place for any research to be valid. The aim of this article is to analyze the validity of the ‘data’ collected in psychotherapeutic research. The authors look at how patients’ stories are translated into numerical data in psychotherapy research. To do so, they discuss the case of a participant who took part in one of their psychotherapeutic studies (SCS).
Participants
This study makes use of a ‘case’, which is an in-depth analysis of one person. In this study, June was the case. June is 26 years old and was referred to treatment by her general practitioner. June experiences an intense and overwhelming fear of being sick/dysfunctional. She fears having different kinds of ‘psychological and physical diseases’ and she is afraid of going crazy. When she hears, reads, talks or even thinks about specific problems that other people have, she becomes obsessed with the fear of having the problem as well. These problems range from depression, paranoia, and substance dependence to cancer and intestinal problems. June was selected because she made a lot of annotations on the paper-and-pencil questionnaires, which makes her a ‘rich case’.
Therapy
June received 58 psychotherapeutic sessions. The therapist was a 36-year-old male with experience in psychotherapy. He works on the principles of supportive-expressive therapy, as defined by Luborsky. This is a form of psychodynamic treatment which focuses on creating a supportive therapeutic environment to pursue cognitive and emotional understanding.
Research team
The case study in this article was conducted by a research team consisting of the first, third, and fourth author. The first author had the task of following up on the data collection. However, the researcher never spoke to the patient because she was not reachable. All contact with the patient went through the therapist. To avoid interpretation bias arising from his therapeutic experience, the therapist was not involved in the analyses of the case data. The findings were then discussed with the therapist (the second author) to reach consensus on the interpretations.
Materials
The data in this article were obtained in a naturalistic psychotherapy study (SCS). In this SCS, all therapy sessions were recorded. Furthermore, after each therapy session, the patients completed self-reports and the therapist wrote a clinical report. Examples of the questionnaires used were:
The General Health Questionnaire (GHQ-12). This is a 12-item self-report measure that assesses general mental wellbeing in the previous week. For example: “Have you recently been able to enjoy your normal day-to-day activities?”. These items are answered on a 4-point Likert scale, in which 0 = Not at all, 1 = Not more than usual, 2 = A bit more than usual, and 3 = A lot more than usual. The GHQ was administered after each therapy session.
The Beck Depression Inventory-II Dutch Version (BDI-II-NL). This questionnaire measures the presence and severity of depression symptoms with 21 items. In the Dutch version, each item has a title that indicates a symptom, for example “aversion of one-self”. Each item is answered on a 4-point Likert scale. An answer is given by circling the number before the sentence. For example, the item “suicidal thoughts or wishes” provides the following options: 0 = I don’t have any thoughts of killing myself, 1 = I have thoughts of killing myself, but I would not carry them out, 2 = I would like to kill myself, and 3 = I would kill myself if I had the chance. In the current study, the BDI was administered following every eighth therapy session.
The Depressive Experiences Questionnaire (DEQ-nl). This questionnaire assesses personality patterns that are associated with depressive symptoms and it contains 66 items, for example: “It is not ‘who you are’, but ‘what you have achieved’ that counts.” The items are scored on a 7-point Likert scale, from 1 = Not at all to 7 = Extremely. In the current study, the DEQ was administered following every eighth therapy session.
The Inventory of Interpersonal Problems (IIP). The IIP assesses interpersonal dynamics from the responder’s perspective, specified for the relationship with their parents. It consists of respectively 64, 32, and 32 items. An example is: “I find it hard to feel good about the happiness of others.” The respondent has to answer how strongly this item reflects themselves. Items are scored on a 5-point Likert scale with 0 = Not at all and 4 = Very strongly. The IIP was administered following every eighth therapy session. The shortened 32-item version was administered following every therapy session.
The Personal Style Inventory (PSI). This is a 48-item questionnaire that assesses personality styles that are related to depression. For example: “I often put the needs of others before my own”. These items are scored on a 6-point Likert scale, ranging from 1 = Completely disagree to 6 = Completely agree. The PSI was administered following every eighth therapy session.
The Symptom Checklist (SCL-90). The SCL-90 assesses mental wellbeing and deviation and it has 90 items. The items follow the question: “To what extent were you hindered by one of the following complaints during the previous week?”. The items are answered on a 5-point Likert scale, from 0 = Not at all to 4 = Extremely. The SCL was administered after every eighth therapy session.
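As a rough sketch of what these fixed answer formats imply for data entry, the example below encodes the response ranges of four of the questionnaires as described above and checks whether an answer fits the printed format. The class and function names are hypothetical, and this is not the procedure used in the SCS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LikertScale:
    name: str
    lowest: int
    highest: int

    def accepts(self, answer) -> bool:
        """True only for a single whole number inside the printed range."""
        return (
            isinstance(answer, int)
            and not isinstance(answer, bool)
            and self.lowest <= answer <= self.highest
        )


# Ranges copied from the questionnaire descriptions above.
SCALES = {
    "GHQ": LikertScale("GHQ", 0, 3),
    "BDI-II-NL": LikertScale("BDI-II-NL", 0, 3),
    "IIP": LikertScale("IIP", 0, 4),
    "PSI": LikertScale("PSI", 1, 6),
}


def check(questionnaire, answer):
    """Look up the scale by name and test one recorded answer against it."""
    return SCALES[questionnaire].accepts(answer)


print(check("IIP", 4))       # a single in-range score
print(check("PSI", 0))       # 0 is not printed on the 1 to 6 scale
print(check("GHQ", 1.5))     # a value between two printed options
print(check("IIP", (0, 4)))  # two answers given to one item
```

Only the first answer fits the format. Anything that is not a single printed option has to be either changed or discarded before it can be entered as a score.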
June came to therapy for a period of three years: weekly, bi-weekly or monthly. After the 53rd session, she stopped coming to therapy. After two years, she restarted her treatment. In total, she received 58 sessions of treatment. After three sessions with the therapist, she agreed to participate in a pilot psychotherapy study. She signed an informed consent form. She also agreed to participate in follow-up interviews, but as noted before, she did not show up to these interviews. In the psychotherapy study (the SCS), different methods were used: quantitative, qualitative and biological. All sessions were audiotaped so that they could be transcribed. After each therapy session, the paper-and-pencil questionnaires were administered. June requested to complete the questionnaires in the presence of the therapist, and this request was accepted. The therapist did not look at the scores and he brought the paper-and-pencil questionnaires to the university. Then, the first author entered the data into SPSS. While doing so, the first author noticed the large number of visual and textual annotations on the questionnaires, and June was therefore purposively selected for a case study.
This article is about the experience of questionnaire administration and not about the results. Therefore, the experiences and visualizations that June reported are discussed, and not the quantitative scores on the questionnaires.
How was the data analyzed?
The narrative data were analyzed based on the principles of ‘interpretative phenomenological analysis’. June’s therapeutic narrative was analysed qualitatively in two different research projects. The first project (HU & FT) focused on the narrative expression of complaints and anxious obsessions, and the second project (ED & FT) focused on the narrative experience of questionnaire administration. In both projects, the researchers first familiarized themselves with the data by transcribing and re-reading it. Afterwards, they coded the data in a data-driven manner. Both projects made use of iterative analysis. No hypotheses were proposed, because this could interfere with a data-driven analysis.
Three main/core themes in June’s complaints are discussed.
First, June is afraid to be ‘crazy’ or ‘ill’. She fears being diagnosed with depression, bipolar mood disorder, paranoia, loss of control, substance dependence, panic or anxiety disorder, chronic fatigue, personality issues, HSP, and other things. She does not fear having a specific disorder or disease; she just fears having something. June also seems to be susceptible to suggestion. When she hears about a random condition, she fears having it herself. For example, she reports about suicide:
Like, sui-, the word suicide com-comes up in my mind and then I am [extremely – in dialect] afraid, I don’t want that [breathes heavily]. Like and then, then it is.. Yeah I am here by myself so in theory I could.. dó something like that, there is no one here to control me, like.. (Session 25)
Whenever June speaks, she analyses her words and says things like: “That is a sign of depression, isn’t it?”
Second, June is afraid to be locked up in a psychiatric institution. June searches for signs of abnormality, because she is afraid that others will perceive her as being crazy. She thinks that when others find her crazy, they might lock her up. Consider the following narrative:
Well, that is that ultimate fear, like, yes, you will be imprisoned, or put in psychiatry, like in a small room or something. And no one will come to you, like, no interaction or no communication or, like, not being recognized as a person but as a patient or something. Or as a “lunatic.” Like, like not normal anymore. (Session 2)
In sum, June’s overt phobic fears seem to stem from a fear of deviating from the societal norm and being an outcast.
Third, June is afraid to be “too much and too little”. June’s relation with societal norms is visualized in a drawing on the questionnaire. She explains her drawing by stating:
With this picture. A box, that is society…. You can even see it in micro or like I don’t know, in 2D. And that is me… that.. that smear, and those are the lines of society and sometimes I have hiatuses, and other times I jump beyond them. And I compare that, or I found that on society, and then I see like [gasps for breath ostentatiously]. So what I do is, those smears that go beyond, those I don’t wanna have, those have to go away. I want control over them, and on the other hand it frightens me, those hiatuses, like [gasps for breath ostentatiously], like “Oh no.” So I would really like to be exactly on the box. (Session 58)
To fit in the ‘box’, she tries to control her feelings and her thoughts, so that they are neither too much nor too little. She tries to understand her feelings and thoughts. She also feels an urge to visualize, and to make her understanding as concrete as possible. One ambivalence in June’s behaviour is that she strives to fit societal norms, but she does not want to coincide with those norms. Thus, she does everything she can to control herself, but on the other hand she is afraid that she might coincide with the box. Consider the following narrative:
“[I am] afraid to be put in a cage and [I want to] escape from that” (Session 11).
First, June is conflicted about the meaning of items. This means that she cannot interpret the items in a straightforward manner. She takes a lot of time to decide which interpretation is “right”. Consider the following narrative:
“I am too generous towards others” … Does [this item] mean materialistically or do you mean…? For me that is really… contradicting. For me, that is materialistic…. I am very egoistic, but uhm… Regarding stuff, like borrowing something that they may use or like, with my time that they… Then I am ver-.. a bit too generous. […] Do I have to write that down like that, or not? Or? (Session 1)
Because of this, she answers the item “I am too generous towards others” twice: she writes down “materially” (scored 0) and “time & services” (scored 4 out of 4).
Another example refers to her answers on the item “I am a very independent person”. She scores this item both a 7 (yes) and 1 (no). She annotates: “when I am anxious, I feel a need for someone to hold me”. She also sometimes divides items into different parts and provides multiple answers to one item.
She also often changes the instructed time frame, so that it better fits the reference that she has in mind, and she alters the response scales. For example, the GHQ is scored on a 4-point Likert scale with 0 = Not at all, 1 = Not more than usual, 2 = A bit more than usual, and 3 = A lot more than usual. June adds a category called “A bit less than usual”. June is also very ambivalent about scores: she often crosses out scores and changes them to another number. She also marks two numbers and connects them with a line, indicating that she prefers a score in between those two numbers.
June’s use of symbols and visual mark-ups indicates that she finds it important to elaborate on her answers.
Second, June is apprehensive about being confronted with the items and about being evaluated by the questionnaires. This happens because the topics of the items can make her fearful. She also makes these fears visible by writing “from this thought, I get scared”.
Third, June is fueled by the act of questionnaire completion. June was apprehensive about the questionnaire items and therefore requested to complete them with the therapist present. During this process, June regularly asks the therapist about the meaning of items, instructions, scores, and so forth. It seems that asking questions calms her down. At the end of the 53rd session, June tells the therapist that she would like to speak with him about how questionnaire items “suggest” things to her or “bring things to her mind”. However, after this session, she does not return for a number of years. In the 54th session, she explains that she had felt that her anxiety was under control and that she did not need therapy anymore. Later she experienced panic attacks and decided to call the therapist for a session. In this session, she explained that the questionnaires brought phobic thoughts to her mind.
The questionnaire as suggestion
When June hears, reads, sees or even thinks about an illness, she becomes fearful and starts to think that she has an illness or psychological disorder. This means that her fears are provoked and increased by the questionnaires in the study. The questionnaires thus became a catalyst of June’s phobic fears, and the questionnaire may become a feared stimulus itself. This explains why she asks the therapist to stay with her during administration: this makes her feel more relaxed.
The questionnaire is not only a primer of June’s fears, it also creates new phobic fears. For example, multiple questions about suicide lead June to wonder whether she would ever be capable of doing something like that. Even though she does not experience any suicidal thoughts, she gets scared that she might be able to do something like it. This impacts the way she scores the items. For example, she is afraid that she might have a personality disorder. Therefore, she is inclined to score her alleged narcissistic traits, because she feels that she needs to be “honest” in her scoring. However, this means that the scores are based on her fear and not on her actual experience.
The questionnaire as judgmental other
The story that June tries to tell via the questionnaire seems to be addressed to ‘another’, which could be a concrete person or, even broader, society. This is indicated by the number of details and remarks she puts next to the items. However, June also fears the verdict of others. This means that the questionnaire becomes a substitute for the evaluating other, which scares June because she is afraid that others will perceive her as ‘crazy’. She does not trust herself, so she feels a strong need to keep control of herself and to prevent others from perceiving what she thinks she can perceive herself.
The questionnaire as a cage
On the one hand, June does everything to fit in, and she does not want to fall outside of the ‘normal’. She is afraid that she might be locked up by others if she does. Therefore, she searches for any signal that shows that she is weird. Thus she tries to fit the ‘box’ of society, but on the other hand she is very afraid to be locked up in a ‘cage’. As a result, she is afraid to ‘tick’ specific answers, because she fears that these answers may lead to her being locked up. For every answer that she provides, she tries to be as nuanced as possible and tries to avoid leaving an image that is open to interpretation. In terms of the questionnaire, June tries to find all issues that could imply craziness. She then evaluates her own behaviours and thoughts based on this questionnaire. However, every box on the questionnaire increases the risk of being understood within some specific image. This means that the questionnaire is a tool to avoid rejection by others, but it also becomes a warrant to be rejected by others. This makes June feel trapped, and in her attempt to escape, she internalizes all the signs so that she can negate any possible symptoms before they even appear to others. June ends up controlling herself, and therefore the questionnaire becomes her cage.
June shows that questionnaires are not always the best method. First, she experienced a lot of problems trying to interpret the questionnaire. She felt conflicted about the meaning of items, and she found it hard to interpret items in a straightforward way. She also felt limited by the provided scales, as indicated by the number of visualizations and remarks. This is consistent with experiences of questionnaire administration in other research settings. For example, Malpass et al. (2016) found that patients often find it hard to ‘fit’ their answers in the scales. Some patients call the scales ‘too simplistic’. This means that the scores that patients provide do not always reflect their experience. This can cause differences in interpretation between respondents, which can harm between-subjects comparability and eventually lead to an entirely invalid measurement.
Second, Schwarz (1990) noted the communicative problems of scales. This also played a role for June: she appeared to be aware of the audience of the questionnaires. This fits with other findings, for example the finding that substance abuse patients like to answer the questionnaires in the presence of the administrator, so that they can clarify the thoughts behind their answers. Another study showed that, when participants are asked to ‘think out loud’ during questionnaire completion, they often hesitate, which indicates that they try to fit in the social situation. This means that questionnaires are often used as a tool to communicate with another person.
However, from a research perspective, June’s tendency to communicate directly with the administrator poses a threat to the validity of research. The researcher has to make a difficult decision. One option is to put June’s data into the standardized format that is commonly used to analyse data, which means that only one of the answers that June provided is used as data and a lot of information is lost. The other option is to enter every response, but this makes it hard to compare June to other participants. Both options threaten the accuracy and legitimacy of the data, because the choice is between losing information and losing comparability. This means that comparability is a specific epistemic choice.
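The two data-entry options can be made concrete with a small sketch. The split answer (0 for “materially”, 4 for “time & services”) is the one June gave; the two other respondents, the collapsing rules and all names are hypothetical, since the article does not specify which of June’s answers would be entered.

```python
from statistics import mean

# One item on a 0-4 scale. Each answer is a (score, note) pair.
# The two single-score respondents are invented for illustration.
responses = {
    "respondent_a": [(2, None)],
    "respondent_b": [(3, None)],
    "june": [(0, "materially"), (4, "time & services")],
}


def first_written(scores):
    """Assumed rule: keep only the first score that was written down."""
    return scores[0]


def force_single(answers, rule=first_written):
    """Option 1: one number per item. Comparable across respondents,
    but the notes and the gap between the two answers are dropped."""
    scores = [score for score, _note in answers]
    return rule(scores)


def keep_all(answers):
    """Option 2: every score with its note. Nothing is lost, but the
    entry no longer has the one-number-per-item shape."""
    return list(answers)


standardized = {who: force_single(a) for who, a in responses.items()}
averaged = {who: force_single(a, rule=mean) for who, a in responses.items()}
full = {who: keep_all(a) for who, a in responses.items()}

print(standardized)
print(averaged)
print(full)
```

Under the averaging rule, June’s split answer becomes indistinguishable from respondent_a’s single middle score, which shows the kind of information that the standardized format removes.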
June was also aware of the evaluation of others who read her questionnaires. This makes it clear that questionnaire administration in psychotherapeutic research is not only about gaining information from ‘the object of interest’, but that it also involves an active process of meaning-making by the subject who is responding. In addition to making sense in the common way, questionnaire items also have to be meaningful within the personal context of the subject. For example, in one study with questionnaire administration after job loss, the participants interacted with the questionnaires to create stories that fit their experiences. Participants used three strategies to deal with a questionnaire item: 1) reject it, 2) construct their experiences in a way that matches the questionnaire’s questions, and 3) reformulate the questionnaire’s questions to match their own experiences. This means that the conclusions stemming from questionnaire data are crafted by the respondents from an idiosyncratic point of reference. Data thus do not just represent a report drawn from the respondent, nor are they the result of a general process by which the researcher’s demand characteristics impact the respondents’ answers. Instead, it is the dyad between question and answer that has to be made sense of.
McClimans (2011) argued that it is impossible to determine how legitimate a question is before any answers are obtained. This means that it is impossible to determine whether standardized questions are legitimate before knowing participants’ answers. Instead, to validate the legitimacy of questions, we should look into the content of the answers, as was done in the current study. This means that researchers should genuinely listen, try to learn from the respondents’ answers, and ask them what they mean. Another issue is that asking questions can interfere with answering, as in the case of June. This is referred to as ‘performativity’, which means that descriptions of symptoms are used to describe experiences: based on the description, participants evaluate their own experiences.
June’s case shows that the validity of data is threatened in multiple ways. The questionnaires led to cognitive and communicative issues as described by Schwarz (1990, 2007). This means that the act of data collection may interact with the variables that are being studied. Further, it is vital for the validity of research to scrutinize the validity of data: it is important to understand what the data means, instead of just checking the validity of measures. Currently, psychological research does not seem to be properly equipped to scrutinize the validity of data collection. As Cronbach and Meehl (1955) stated, validity is not a feature of the instrument itself; instead, it depends on a valid application in a particular research context. This means that measures should be supplemented with qualitative questions on the meaning of answers. Also, for statistical analysis to be valid, the input to the analysis should be valid. June’s case shows that validity problems involve more than what is discussed in the literature. Questionnaire administration can affect the complaints, simply because they are being measured.