My lab’s research lies at the intersection of computational linguistics, social media mining, applied machine learning, public health, and medicine. We work with natural language data from various sources: social media, published medical literature, and electronic health records. We design and develop solutions for both big data and small data problems associated with health.

Our research is currently funded by the National Institute on Drug Abuse of the National Institutes of Health, the Robert Wood Johnson Foundation, and internal funds from the Emory University School of Medicine.

The following is a brief description of our current and past projects.

See the Publications page for a list of publications.

We also publish code via Bitbucket, and data and other resources via services such as Mendeley.

Mining social network postings for monitoring prescription medication abuse

Prescription medication abuse and overdose together form the fastest-growing drug-related problem in the USA. This growth calls for improved monitoring strategies for investigating the prevalence and patterns of abuse of specific medications. The primary aims of this project are to assess whether social media can serve as a resource for automatic monitoring of prescription medication abuse, and to devise automatic techniques for identifying and assessing the extent of abuse of various prescription medications. This project is funded by NIDA/NIH.


Helping state Medicaid agencies explore strategies to better understand and engage with eligible and enrolled populations

This research project attempts to fill an important gap in understanding whether and how Medicaid officials engage with enrolled or eligible populations in decision-making. Part of the project focuses specifically on the use of social media as an engagement technique for state Medicaid officials.

Further details about this project will be added at a later date.

Text summarization for evidence-based medicine

The goal of this project is to develop algorithms for query-focused text summarization of published medical literature for evidence-based medicine. The specific tasks are: (i) automated classification of the quality of medical evidence, (ii) single-document, query-focused, extractive summarization, and (iii) multi-document summarization via sentence-level polarity classification. The following diagram illustrates all the tasks and the entire workflow of the developed system.

[Figure: diagram of the tasks and overall workflow of the developed system]
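
To give a feel for task (ii), single-document, query-focused, extractive summarization, here is a minimal sketch in Python. It is not the system developed in this project: it assumes a simple TF-IDF weighting with cosine similarity between the query and each sentence, and the function names (`tokenize`, `split_sentences`, `summarize`) and the sample text are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(document, query, k=2):
    """Return the k sentences most similar to the query, in document order.

    Illustrative assumption: sentences are scored by TF-IDF cosine similarity
    to the query. A real system would likely use richer features.
    """
    sentences = split_sentences(document)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: in how many sentences each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(tokens):
        tf = Counter(tokens)
        # Terms that never occur in the document get weight 0.
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = vectorize(tokenize(query))
    scores = [cosine(query_vec, vectorize(tokens)) for tokens in tokenized]

    # Pick the k best-scoring sentences, then restore their original order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    # Made-up sample text, for illustration only.
    sample = (
        "Aspirin reduced headache severity in the treated group. "
        "The study enrolled 120 adults. "
        "Participants were followed for six weeks. "
        "No serious adverse events were linked to aspirin."
    )
    for sentence in summarize(sample, "Does aspirin relieve headache?", k=2):
        print(sentence)
```

The sketch is extractive in that it only selects existing sentences rather than generating new text, and query-focused in that the ranking depends on the query rather than on the document alone.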