February 23, 2024

Nurturing Noble Wellness

CHQ-SocioEmo: Identifying Social and Emotional Support Needs in Consumer-Health Questions

Data collection

We used the popular community question answering “Yahoo! Answers L6” dataset18. The dataset is made available to researchers by the Yahoo! Research Alliance Webscope program, on the condition that they consent to use the data for non-commercial research purposes only. The Yahoo! Answers L6 dataset contains about 4.4 million anonymized questions across various topics, along with their answers. The dataset also provides question-specific metadata such as best answers, number of answers, question category, question subcategory, and question language. Since the focus of this study is consumer health, we restricted ourselves to questions whose category is “Healthcare” and whose language is “English”. To further ensure that the questions cover diverse health topics and are informative, we devised a multi-step filtering strategy. In the first step, we identified the medical entities in the questions using the Stanza19 Biomedical and Clinical model trained on the NCBI-Disease corpus. Next, we selected only those question threads with at least one medical entity present in the question. With this process, we obtained 22,257 question threads from the Yahoo! Answers corpus. In the final step, we removed low-content question threads. Specifically, we retained the questions with more than 400 characters, because longer questions tend to include a variety of needs and background information about health consumers. The final data includes 5,000 question threads.
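The filtering steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the field names (`category`, `language`, `question`) are hypothetical, and `keyword_entities` is a toy stand-in for the NER model, which in the described pipeline would be a Stanza biomedical model.

```python
from typing import Callable, Iterable

MIN_CHARS = 400  # threshold stated in the text: keep questions longer than this


def filter_threads(
    threads: Iterable[dict],
    find_entities: Callable[[str], list],
) -> list:
    """Apply the category/language, medical-entity, and length filters in order."""
    kept = []
    for thread in threads:
        # Hypothetical field names; the real dataset schema may differ.
        if thread.get("category") != "Healthcare":
            continue
        if thread.get("language") != "English":
            continue
        question = thread.get("question", "")
        # Keep only threads whose question mentions at least one medical entity.
        if not find_entities(question):
            continue
        # Drop low-content threads: the question must exceed 400 characters.
        if len(question) <= MIN_CHARS:
            continue
        kept.append(thread)
    return kept


def keyword_entities(text: str) -> list:
    """Toy stand-in for an NER model; matches a tiny illustrative vocabulary."""
    vocabulary = ("infertility", "diabetes", "asthma")
    lowered = text.lower()
    return [term for term in vocabulary if term in lowered]
```

Passing the entity detector as a function keeps the filter logic independent of any particular NER library, so a real model can be swapped in without changing the rest of the sketch.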

Annotation tasks

We used our own annotation interface for all annotation stages. We deployed the interface as a Heroku application with a PostgreSQL database. Each annotator received a secure account through which they could annotate and save their progress. We started with smaller batches of 20 questions and gradually increased the batch size to 100 questions as the annotators became more familiar with the task. The first 20 questions (the trial batch) were the same for all annotators, so the annotators worked on the task in parallel. Their annotations were first validated on the trial batch, and they were given feedback to help them correct their mistakes. They qualified for the main annotation rounds after demonstrating satisfactory performance on the trial batch. In addition, group meetings were held to discuss disagreements and document their resolution before the next batches were assigned.

The following aspects of the questions were annotated:

Demographic information includes the age and sex mentioned in consumer health questions.

Question Focus is the named entity that denotes the central theme (topic) of the question. For example, infertility is the focus of the question in Fig. 1.

Emotional states, evidence and causes

Given a predefined set of emotions based on the Plutchik-8 basic emotions20, annotators labeled a question with all the emotions it contains. Annotators could assign none, one, or more emotions to a single consumer health question. For example, a question could be annotated as exhibiting sadness, or a combination of sadness and fear. The included emotional states, which add denial, confusion, and neutral to the eight basic emotions, are listed below with their definitions.

  • Sadness: An emotional pain associated with, or characterized by, feelings of disadvantage, loss, despair, grief, helplessness, disappointment, and sorrow.

  • Joy: A feeling of great pleasure and happiness.

  • Fear: An unpleasant emotion caused by the belief that someone or something is dangerous, likely to cause pain, or a threat.

  • Anger: An intense emotional state involving a strong, uncomfortable, and non-cooperative response to a perceived provocation, hurt, or threat.

  • Surprise: A brief mental and physiological state, a startle response experienced by animals and humans as the result of an unexpected event.

  • Disgust: An emotional response of rejection or revulsion to something potentially contagious or something considered offensive, distasteful, or unpleasant.

  • Trust: Firm belief in the reliability, truth, ability, or strength of someone or something. This does not include mistrust or trust issues.

  • Anticipation: An emotion involving pleasure or anxiety in considering or awaiting an expected event.

  • Denial: Refusing to accept or believe something.

  • Confusion: A feeling that you do not understand something or cannot decide what to do. This includes lack of understanding and communication issues.

  • Neutral: No emotion is indicated.

In addition, we distinguish between emotion evidence and emotion cause, and we ask annotators to label both.

  • Emotion evidence is a part of the text that indicates the presence of an emotion in the health consumer question. Annotators highlight the span of text that cued them to label the emotion.

  • Emotion cause is a part of the text expressing the reason for the health consumer to feel the emotion given by the emotion evidence. That can be an event, person, or object that causes the emotion.

For example, the sentence “Do you think my outlook is a good one?” shown in Fig. 1 is evidence for the Fear emotion, and the cause of Fear is infertility. As this example shows, the evidence and the cause are not always found within one sentence. The annotation interface, however, ties them together.
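One way to picture how an emotion label, its evidence span, and its cause span are tied together is a small record type. The sketch below is an assumption about a possible representation, not the format of the released dataset; the class names, the `annotate` helper, and the wording around the quoted sentence in the usage example are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The eleven labels listed above (eight basic emotions plus three additions).
EMOTIONS = {
    "sadness", "joy", "fear", "anger", "surprise", "disgust",
    "trust", "anticipation", "denial", "confusion", "neutral",
}


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset into the question text
    end: int    # exclusive character offset

    def text(self, source: str) -> str:
        return source[self.start:self.end]


@dataclass
class EmotionAnnotation:
    emotion: str
    evidence: Span                # text that signals the emotion
    cause: Optional[Span] = None  # text that explains why it is felt

    def __post_init__(self) -> None:
        if self.emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion label: {self.emotion!r}")


def annotate(question: str, emotion: str, evidence_text: str,
             cause_text: Optional[str] = None) -> EmotionAnnotation:
    """Build an annotation by locating the first match of each snippet."""
    def locate(snippet: str) -> Span:
        start = question.index(snippet)  # raises ValueError if absent
        return Span(start, start + len(snippet))

    cause = locate(cause_text) if cause_text is not None else None
    return EmotionAnnotation(emotion, locate(evidence_text), cause)


# Illustrative question text; only the quoted sentence comes from the article.
question = "I was told I have infertility. Do you think my outlook is a good one?"
fear = annotate(question, "fear",
                "Do you think my outlook is a good one?", "infertility")
```

Because the evidence and the cause are stored as separate spans on one record, they can sit in different sentences and still belong to the same emotion label.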

Social support needs

According to Cutrona and Suhr’s Social Support Behavior Code21, social support exchanged in different settings can be classified as follows:

  • Informational support (e.g., seeking detailed information or facts)

  • Emotional support (e.g., seeking empathy, caring, sympathy, encouragement, or prayer support)

  • Esteem support (e.g., seeking to build confidence, validation, compliments, or relief of pain)

  • Network support (e.g., seeking belonging, companions, or network resources)

  • Tangible support (e.g., seeking services)

Examples of the five social support needs are presented in Table 1.

Table 1 Examples of Social Support Needs.

The following aspect of the answers was annotated:

Emotional support in the answer. For each answer, annotators had to read the answer and indicate whether it responds to the emotional, esteem, network, or tangible support needs, as follows:

  • Yes: if the answer is responding to the emotional, esteem, network, or tangible support needs. The answers were not judged on the completeness or quality with respect to the informational needs. The text span that cued the annotator to the positive response was annotated in the answer.

  • No: if the answer is not responding to the emotional, esteem, network, or tangible support needs.

  • Not applicable: if the question seeks only informational support, so there are no non-informational aspects of the question for the answer to address.
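The three-way answer label depends on which support needs the question expresses, which can be sketched as a small decision function. The enum and function names are hypothetical, and treating a question with no annotated needs as "not applicable" is an assumption the text does not address.

```python
from enum import Enum


class SupportNeed(Enum):
    INFORMATIONAL = "informational"
    EMOTIONAL = "emotional"
    ESTEEM = "esteem"
    NETWORK = "network"
    TANGIBLE = "tangible"


class AnswerLabel(Enum):
    YES = "yes"
    NO = "no"
    NOT_APPLICABLE = "not applicable"


def label_answer(question_needs: set, responds_to_support: bool) -> AnswerLabel:
    """Map a question's needs and the annotator's judgement to an answer label."""
    non_informational = question_needs - {SupportNeed.INFORMATIONAL}
    # Assumption: an empty set of needs is handled like informational-only.
    if not non_informational:
        return AnswerLabel.NOT_APPLICABLE
    return AnswerLabel.YES if responds_to_support else AnswerLabel.NO
```

Note that `responds_to_support` is the annotator's judgement about the non-informational needs only; the quality of the informational content does not enter the label.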

Annotator background

The annotation task was completed by 10 annotators (2 male, 7 female, 1 non-binary). As Table 2 shows, the annotators’ ages ranged from 25 to 74 years, and most of them were in the 25–34 and 45–54 brackets. The distribution of ethnicity was 4 White, 3 Asian, 2 Black, and 1 Two or more races. For diversity, we chose annotators from different areas of expertise, including biology/genetics, information science/systems, and clinical research. All annotators have a higher education degree, and 60% of them have a doctorate. They had a working knowledge of basic emotions and received specific annotation training and guidelines. To measure the annotators’ current state of empathy, 9 annotators completed the State Empathy Scale (SES)22. It captures three dimensions of state empathy: affective, cognitive, and associative. According to the instrument, affective empathy represents one’s personal affective reactions to others’ experiences or expressions of emotions. Cognitive empathy refers to adopting others’ perspectives by understanding their circumstances, whereas associative empathy encompasses the sense of social bonding with another person. According to the results shown in Table 3, the annotators were generally in a state of high empathy, with an average of 3.31 on a 5-point Likert scale ranging from 0 (“not at all”) to 4 (“completely”). The annotators showed higher cognitive empathy than affective or associative empathy (M affective = 3.06, cognitive = 3.64, associative = 3.22). This result suggests that the annotators were capable of ensuring their own emotions did not interfere in annotating others’ emotions, and that their perception was based on the context described in the medical questions. Table 4 shows descriptive data, including mean, standard deviation, and confidence interval, for the State Empathy Scale items.

Table 2 Demographic information of annotators.
Table 3 State Empathy Scale (SES)22 (n = 9).
Table 4 Descriptive Data including Mean, Standard Deviation (SD), Confidence Interval for the State Empathy Scale items.
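As a quick consistency check on the reported scores, the overall average of 3.31 matches the unweighted mean of the three subscale means. The sketch below assumes each dimension contributes equally to the overall score; the article does not say how the overall figure was computed, and the function name is hypothetical.

```python
def overall_mean(subscale_means: dict) -> float:
    """Unweighted mean of subscale means (assumes equal weight per dimension)."""
    return sum(subscale_means.values()) / len(subscale_means)


# Subscale means as reported in the text, on the 0-4 scale.
ses = {"affective": 3.06, "cognitive": 3.64, "associative": 3.22}
print(round(overall_mean(ses), 2))  # 3.31
```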

Inter-rater agreement

To measure inter-annotator agreement (IAA), we sampled, from the whole collection, 129 questions that had been annotated by three annotators and asked three other annotators to annotate the same questions. IAA is calculated using overall agreement. Table 5 shows the overall agreement for emotional states and support needs in the CHQ-SocioEmo dataset. We first looked at the per-emotion IAA and found that sadness, fear, confusion, and anticipation had the lowest inter-annotator agreement, with overall agreement below 75%. Joy, trust, surprise, disgust, and denial elicited a higher level of agreement, with overall agreement of 75% or higher. We also looked at agreement for each category of social support needs and found that all categories had substantial agreement except emotional support, which had lower overall agreement (57.36%). This is an open-ended task in which perception is shaped by the annotators’ disparate backgrounds and emotional make-up; we therefore expected moderate agreement, as in other open-ended tasks such as MEDLINE indexing23.

Table 5 Overall agreement for emotional states and support needs in the CHQ-SocioEmo dataset.
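The article does not spell out the formula behind "overall agreement". A common reading, assumed here, is the percentage of questions on which two annotations agree about the presence or absence of a given label. Under that reading, 57.36% on 129 questions corresponds to agreement on 74 of them. The function name and the set-based input format are hypothetical.

```python
def overall_agreement(labels_a: list, labels_b: list, label: str) -> float:
    """Percent of questions where both annotations agree on `label`.

    Each list holds one set of labels per question; agreement means the
    label is present in both sets or absent from both.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotations must cover the same questions")
    if not labels_a:
        raise ValueError("at least one question is required")
    matches = sum(
        (label in a) == (label in b) for a, b in zip(labels_a, labels_b)
    )
    return 100.0 * matches / len(labels_a)


# Synthetic data: 129 questions, agreement on 74 of them.
first = [{"emotional"}] * 129
second = [{"emotional"}] * 74 + [set()] * 55
print(round(overall_agreement(first, second, "emotional"), 2))  # 57.36
```

Unlike chance-corrected measures, this percentage does not account for agreement expected by chance, so rare labels can show high agreement simply because both annotations usually omit them.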

