Turning to the internet for health-related information has become commonplace, whether searching symptoms to uncover possible diagnoses or asking for advice on social media.
“When people are diagnosed with a health condition or starting a new treatment, they often turn to peers on social media for emotional support and information,” says Sarah Preum, an assistant professor of computer science.
Health-related social media discussion groups are an invaluable repository where users from diverse demographics spontaneously share lived experiences and challenges, says Preum.
Her research employs natural language processing (artificial intelligence techniques for understanding human language) to extract meaningful information from health-related social media interactions.
In an ongoing project, Preum and her collaborators developed an AI framework to analyze posts from a Reddit forum where users discuss their experiences with FDA-approved medication treatment for opioid use disorder.
“Social media is a goldmine for understanding patient perspectives that you might not hear in a clinical setting,” says Jacob Borodovsky, senior research scientist and epidemiologist at the Center for Technology and Behavioral Health, who was drawn to the study because it applies relatively unorthodox data and methods to substance use research.
“Plus, Reddit is anonymous, which can lead to very open conversations. This type of naturalistic, unfiltered data can be incredibly insightful,” says Borodovsky.
Preum and her colleagues developed a model that identifies and categorizes information-seeking Reddit posts based on events: broad categories that define the trajectory of treatment, created in consultation with domain experts Borodovsky and Sarah Lord, a clinical-developmental psychologist and associate professor of psychiatry, biomedical data science, and pediatrics at the Geisel School of Medicine.
To train the AI model, a group of researchers created a new dataset of information-seeking posts that they manually categorized into events.
“I take 1-2 mg subs per day which is a decrease from the original dose of 8 mg. Just looking for a plan of action in which to stick with to eventually get off completely,” queries one user who is ready to taper treatment.
“When I run out of my Suboxone prematurely, I like to keep Kratom on hand for my extremely low energy and excessive yawning,” shares another.
Researchers tagged the former post with “Taking MOUD” (medications for opioid use disorder) and “Tapering” events, while the latter was filed under “Relapse” and “Psychophysical Effects” events.
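To make that annotation concrete, here is a minimal sketch, assuming a standard multi-label setup in Python, of how such event tags might be encoded for model training. The two posts and four event names echo the examples above; the encoding itself is an illustration, not the study's actual pipeline.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical mini-dataset echoing the two posts quoted above;
# the study's real dataset and full event taxonomy are larger.
annotated_posts = [
    ("I take 1-2 mg subs per day ... to eventually get off completely.",
     ["Taking MOUD", "Tapering"]),
    ("When I run out of my Suboxone prematurely, I keep Kratom on hand ...",
     ["Relapse", "Psychophysical Effects"]),
]

texts, event_tags = zip(*annotated_posts)

# Multi-hot encoding: one row per post, one column per event,
# so a single post can carry several event labels at once.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(event_tags)
print(mlb.classes_)  # event names, sorted alphabetically
print(y)             # e.g., [[0 0 1 1], [1 1 0 0]]
```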
AI models trained on this dataset then scan several thousand posts and sort them by event. The researchers evaluated the performance of several different models, some trained on their dataset and others that were not.
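As an illustration of the latter kind of model, one not trained on the team's dataset, the sketch below runs an off-the-shelf zero-shot classifier over a post. The event labels are drawn from the examples above, and the facebook/bart-large-mnli checkpoint is an assumption for demonstration, not necessarily one of the models the team evaluated.

```python
from transformers import pipeline

# Assumed event labels drawn from the examples above; the study's
# full taxonomy was defined with domain experts.
EVENTS = ["Taking MOUD", "Tapering", "Relapse", "Psychophysical Effects"]

# A generic zero-shot classifier, standing in for models that were
# never trained on the team's annotated dataset.
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

post = ("I take 1-2 mg subs per day, down from the original 8 mg. "
        "Looking for a plan to eventually get off completely.")

# multi_label=True lets one post receive several event tags at once.
result = classifier(post, candidate_labels=EVENTS, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```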
The study found that domain expertise is needed to annotate the training data, and there is still room to optimize such AI models. “For categorizing such large-scale, complex health data more effectively, we need a human-in-the-loop system,” says Omar Sharif, a computer science PhD student at the Guarini School of Graduate and Advanced Studies who is part of the team that built the AI model.
“Our studies show that analyzing health-related conversations in social discourse through the lens of events is more effective, objective, and generalizable than existing techniques,” says Preum. “Ultimately, our method can automatically map the information needs to critical stages of treatments.”
Identifying recurring topics and understanding what kind of information is commonly exchanged offers unique insight into the experiences and knowledge gaps of people being treated for opioid use disorder, says computer science PhD student Madhusudan Basak, who worked with Sharif and Preum on the project.
The analyses sift through peer exchanges to reveal challenges in adhering to treatment regimens and the mechanisms recovering patients use to cope with side effects and relapses. They also surface false and potentially harmful information, including misinformation propagated through such channels.
Such information mined from social media can be used in all kinds of innovative ways, says Lord, who was part of one of the first research teams to conduct an online survey of prescription opioid and stimulant misuse among college students in the early days of Facebook. It can inform efforts to improve communication between clinicians and patients and empower patients to connect to care.
“With tremendous capabilities of machine learning, social media platforms offer exciting opportunities for identifying diverse treatment needs and reaching people in their own virtual communities,” Lord says.