Publications (selected)

An updated list should be on my Google Scholar page (Google typically does a better job than I do).

2019

A Sarker, A DeRoos, J Perrone. Mining social media for prescription medication abuse monitoring: a review and proposal for a data-centric framework. Journal of the American Medical Informatics Association. doi: 10.1093/jamia/ocz162.

AZ Klein, A Sarker, D Weissenbacher, G Gonzalez-Hernandez. Towards scaling Twitter for digital epidemiology of birth defects. npj Digital Medicine. 2, Article number: 96 (2019). doi: 10.1038/s41746-019-0170-5.

A Sarker, AZ Klein, J Mee, P Harik, G Gonzalez-Hernandez. An interpretable natural language processing system for written medical examination assessment. Journal of Biomedical Informatics. Volume 98. October 2019, 103268. https://doi.org/10.1016/j.jbi.2019.103268

A Dirkson, S Verberne, A Sarker, W Kraaij. Data-Driven Lexical Normalization for Medical Social Media. Multimodal Technologies Interact. 2019, 3(3), 60; https://doi.org/10.3390/mti3030060.

A Sarker, G Gonzalez-Hernandez, F DeRoos, J Perrone. Towards real-time opioid abuse surveillance: machine learning for automatic characterization of opioid-related tweets. 39th International Congress of the European Association of Poisons Centres and Clinical Toxicologists (EAPCCT) 21-24 May 2019, Naples, Italy. Clinical Toxicology. Volume 57. Issue 6. Number: 114. [Abstract]. Page: 475. DOI: 10.1080/15563650.2019.1598646.

A Sarker, G Gonzalez-Hernandez, F DeRoos, J Perrone. Towards real-time opioid abuse surveillance: machine learning for automatic characterization of opioid-related tweets. 39th International Congress of the European Association of Poisons Centres and Clinical Toxicologists (EAPCCT) 21-24 May 2019, Naples, Italy. Clinical Toxicology. Volume 57. Issue 6. Number: 113. [Abstract]. Pages: 474-475. DOI: 10.1080/15563650.2019.1598646.

J Love, J Perrone, R Graves, A Sarker. Sentiment, Themes, and Analyses in Tweets about Suboxone. In: ACMT 2019 Annual Scientific Meeting Abstracts (Abstract# 019). Journal of Medical Toxicology. Page: 59. DOI: https://doi.org/10.1007/s13181-019-00699-x.

R Graves, J Perrone, J Love, A Sarker. Can Reddit Reveal Unique Facets of Suboxone Use and Misuse? In: ACMT 2019 Annual Scientific Meeting Abstracts (Abstract# 018). Journal of Medical Toxicology. Pages: 58-59. DOI: https://doi.org/10.1007/s13181-019-00699-x.

A Magge, A Sarker, A Nikfarjam, G Gonzalez-Hernandez. Comment on: “Deep learning for pharmacovigilance: recurrent neural network architectures for labeling adverse drug reactions in Twitter posts”. Journal of the American Medical Informatics Association. 26 (6), 577-579. [Correspondence/Letter to the Editor].

D Weissenbacher, A Sarker, A Klein, K O’Connor, A Magge, G Gonzalez-Hernandez. Deep neural networks ensemble for detecting medication mentions in tweets. Journal of the American Medical Informatics Association. 2019 Sep 27. pii: ocz156. doi: 10.1093/jamia/ocz156.

AZ Klein, A Sarker, K O’Connor, G Gonzalez-Hernandez. An Analysis of a Twitter Corpus for Training a Medication Intake Classifier. AMIA Jt Summits Transl Sci Proc. 102-106.

A Magge, D Weissenbacher, A Sarker, M Scotch, G Gonzalez-Hernandez. Bi-directional Recurrent Neural Network Models for Geographic Location Extraction in Biomedical Literature. In: Proceedings of the Pacific Symposium on Biocomputing. Volume 24. Pages 100-111.

2018

A Sarker, M Belousov, J Friedrichs, K Hakala, S Kiritchenko, F Mehryary, S Han, T Tran, A Rios, R Kavuluru, B de Bruijn, F Ginter, D Mahata, S Mohammad, G Nenadic, G Gonzalez-Hernandez. Data and systems for medication-related text classification and concept normalization from Twitter: Insights from the Social Media Mining for Health (SMM4H) 2017 shared task. Journal of the American Medical Informatics Association. Volume 25, Issue 10. Pages 1274-1283. [data]

A Sarker, G Gonzalez-Hernandez. An unsupervised and customizable misspelling generator for mining noisy health-related text sources. Journal of Biomedical Informatics. Volume 88. Pages 98-107.

AZ Klein, A Sarker, H Cai, D Weissenbacher, G Gonzalez-Hernandez. Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter. Journal of Biomedical Informatics. Volume 87. Pages 68-78.

D Weissenbacher, A Sarker, MJ Paul, G Gonzalez-Hernandez. Overview of the Third Social Media Mining for Health (SMM4H) Shared Tasks at EMNLP 2018. In: Proceedings of the 3rd Social Media Mining for Health Applications (SMM4H) Workshop & Shared Task. Pages 13-16.

K Smith, S Golder, A Sarker, Y Loke, K O’Connor, G Gonzalez-Hernandez. Methods to Compare Adverse Events in Twitter to FAERS, Drug Information Databases, and Systematic Reviews: Proof of Concept with Adalimumab. Drug Safety. Volume 41, Issue 12. Pages 1397-1410.

A Magge, D Weissenbacher, A Sarker, M Scotch, G Gonzalez-Hernandez. Deep neural networks and distant supervision for geographic location mention extraction. Bioinformatics 34 (13), i565-i573.

A Sarker, G Gonzalez, FJ DeRoos, LS Nelson, J Perrone. Toxicovigilance through social media: quantifying abuse-indicating information in Twitter data. EAPCCT. 56 (6), 454.

M Rouhizadeh, A Magge, A Klein, A Sarker, G Gonzalez. A Rule-based Approach to Determining Pregnancy Timeframe from Contextual Social Media Postings. In Proceedings of the 2018 International Conference on Digital Health, 16-20.

2017

A Sarker, P Chandrashekar, A Magge, H Cai, AZ Klein, G Gonzalez. Discovering cohorts of pregnant women from social media for safety surveillance and analysis. Journal of Medical Internet Research (JMIR). DOI: 10.2196/jmir.8164 [Article in press]. [resources]

A Sarker. A Customizable Pipeline for Social Media Text Normalization. Social Network Analysis and Mining. DOI: 10.1007/s13278-017-0464-z. [resources]

G Gonzalez, A Sarker, K O’Connor, G Savova. Capturing the Patient’s Perspective: a Review of Advances in Natural Language Processing of Health-Related Text. IMIA Yearbook of Medical Informatics. 2017:214-27 http://dx.doi.org/10.15265/IY-2017-029.

A Sarker*, D Weissenbacher*, T Tahsin, M Scotch, G Gonzalez. Extracting geographic locations from the literature for virus phylogeography using supervised and distant supervision methods. Proceedings of AMIA TBI Symposium. 2017. [*equal contribution first authors].

A Sarker, G Gonzalez. HLP@UPenn at SemEval-2017 Task 4A: A simple, self-optimizing text classification system combining dense and sparse vectors. SemEval-2017. 640-643. 2017.

AZ Klein, A Sarker, M Rouhizadeh, K O’Connor, G Gonzalez. Detecting Personal Medication Intake in Twitter: An Annotated Corpus and Baseline Classification System. BioNLP. 2017. Pages 136-142. [resources]

A Sarker, G Gonzalez. A corpus for mining drug-related knowledge from Twitter chatter: Language models and their utilities. Data in Brief Journal. Volume 10. Pages 121-131. DOI: http://dx.doi.org/10.1016/j.dib.2016.11.056. 2017. [resources]

A Sarker, A Magge, A Sharma. Dermatologic Concerns Communicated Through Twitter. International Journal of Dermatology. doi:10.1111/ijd.13506. 2017.

A Sarker, D Malone, G Gonzalez. Authors’ Reply to Jouanjus and Colleagues’ Comment on “Social Media Mining for Toxicovigilance: Monitoring Prescription Medication Abuse from Twitter”. Drug Safety. Feb;40(2):187-188. doi: 10.1007/s40264-016-0498-6. 2017.

2016

I Korkontzelos, A Nikfarjam, M Shardlow, A Sarker, S Ananiadou, G Gonzalez. Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. Journal of Biomedical Informatics (JBI). Volume 62. Pages 148-158. 2016.

A Sarker, K O’Connor, R Ginn, M Scotch, K Smith, D Malone, G Gonzalez. Social Media Mining for Toxicovigilance: Automatic Monitoring of Prescription Medication Abuse from Twitter. Drug Safety. Pages 1-10. DOI: 10.1007/s40264-015-0379-4. 2016. [resources]

A Sarker, D Molla, C Paris. Query-oriented evidence extraction to support evidence-based medicine practice. Journal of Biomedical Informatics (JBI). Volume 59. Pages 169-184. DOI: http://dx.doi.org/10.1016/j.jbi.2015.11.010. 2016. [code]

R Sullivan, A Sarker, K O’Connor, A Goodin, M Karlsrud, G Gonzalez. Monitoring nutritional supplements: Challenges and promises of mining user comments for adverse events. Proceedings of the Pacific Symposium on Biocomputing. 2016.

A Sarker, G Gonzalez. DiegoLab16 at Semeval-2016 Task 4: Sentiment Analysis in Twitter using Centroids, Clusters and Sentiment Lexicons. SemEval-16. 214-219. 2016.

MJ Paul, A Sarker, J Brownstein, A Nikfarjam, M Scotch, K Smith, G Gonzalez. Social media mining for public health monitoring and surveillance. Proceedings of the Pacific Symposium on Biocomputing. 2016.

A Sarker, A Nikfarjam, G Gonzalez. Social Media Mining Shared Task Workshop. Proceedings of the Pacific Symposium on Biocomputing. 2016. [description] [task 1 page] [task 2 data]

2015

D Molla, E Santiago-Martinez, A Sarker, C Paris. A Corpus for Research in Text Processing for Evidence Based Medicine. Journal of Language Resources and Evaluations. 2015. [resources]

A Sarker, R Ginn, A Nikfarjam, K O’Connor, K Smith, S Jayaraman, T Upadhaya, G Gonzalez. Utilizing social media data for pharmacovigilance: A review. Journal of Biomedical Informatics (JBI). Volume 54. Pages 202-212. DOI: http://dx.doi.org/10.1016/j.jbi.2015.02.004. 2015. [Editor’s choice; Nominated for ATLAS (September, 2015)] [resources]

A Sarker, D Molla, C Paris. Automatic evidence quality prediction to support evidence-based decision making. Artificial Intelligence in Medicine (AIIM). Volume 64. Pages 89-103. DOI: http://dx.doi.org/10.1016/j.artmed.2015.04.001. 2015. [corpus]

A Nikfarjam, A Sarker, K O’Connor, R Ginn, G Gonzalez. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features. Journal of the American Medical Informatics Association (JAMIA). DOI: http://dx.doi.org/10.1093/jamia/ocu041. Pages: 0-11. 2015. [data/resources] [code]

A Sarker, A Nikfarjam, D Weissenbacher, G Gonzalez. DIEGOLab: An Approach for Message-level Sentiment Classification in Twitter. In Proceedings of the 9th International Workshop on Semantic Evaluation (SEMEVAL). Pages: 510-514. 2015.

2014

A Sarker, G Gonzalez. Portable automatic text classification for adverse drug reaction detection via multi-corpus training. Journal of Biomedical Informatics (JBI). Volume 53. Pages: 196-207. DOI: 10.1016/j.jbi.2014.11.002. 2014. [data/resources] [code]

K O’Connor, A Nikfarjam, R Ginn, P Pimpalkhute, A Sarker, K Smith, G Gonzalez. Pharmacovigilance on Twitter? Mining Tweets for Adverse Drug Reactions. In Proceedings of the American Medical Informatics Association Annual Symposium (AMIA). 2014.

R Ginn, P Pimpalkhute, A Nikfarjam, A Patki, K O’Connor, A Sarker, G Gonzalez. Mining Twitter for adverse drug reaction mentions: a corpus classification benchmark. In Proceedings of the Fourth Workshop on Building and Evaluating Resources for Health and Biomedical Text Processing (BIOTXTM). 2014. [resources]

D Molla, C Jones, A Sarker. Impact of citing papers for summarisation of clinical documents. In Proceedings of the Australasian Language Technology Association (ALTA) Workshop. Pages: 79-87. 2014.

A Patki, A Sarker, P Pimpalkhute, A Nikfarjam, R Ginn, K O’Connor, K Smith, G Gonzalez. Mining Adverse Drug Reaction Signals from Social Media: Going Beyond Extraction. In Proceedings of BiolinkSIG. 2014.

A Sarker. Automated Medical Text Summarisation to Support Evidence-based Medicine. Ph.D Thesis. Macquarie University. 2014.

2013

A Sarker, D Molla, C Paris. An Approach for Query-Focused Text Summarisation for Evidence Based Medicine. Artificial Intelligence in Medicine (AIME). Lecture Notes in Computer Science. Volume: 7885. Publisher: Springer Berlin Heidelberg. Pages: 295-304. 2013.

A Sarker, D Molla, C Paris. Automatic Prediction of Evidence-based Recommendations via Sentence-level Polarity Classification. In Proceedings of the International Joint Conference on Natural Language Processing. Pages: 712-718. 2013.

A Sarker, D Molla, C Paris. An Approach for Automatic Multi-label Classification of Medical Sentences. In Proceedings of the 4th International Louhi Workshop on Health Document Text Mining and Information Analysis (LOUHI). 2013.

2012 and earlier 

A Sarker, D Molla, C Paris. Extractive evidence based medicine summarisation based on sentence-specific statistics. In Proceedings of Computer Based Medical Systems (CBMS). 2012.

A Sarker, D Molla, C Paris. Towards two-step multi-document summarisation for evidence based medicine: a quantitative analysis. In Proceedings of the Australasian Language Technology Association (ALTA) Workshop. Pages: 79-87. 2012.

A Sarker, D Molla, C Paris. Outcome Polarity Identification of Medical Papers. In Proceedings of the Australasian Language Technology Association (ALTA) Workshop. Pages: 105-114. 2011.

A Sarker, D Molla, C Paris. Towards Automatic Grading of Evidence. In Proceedings of the 4th International Louhi Workshop on Health Document Text Mining and Information Analysis (LOUHI). Pages: 51-58. 2011.

D Molla, A Sarker. Automatic grading of evidence: the 2011 ALTA shared task. In Proceedings of the Australasian Language Technology Association (ALTA) Workshop. Pages: 4-8. 2011.

A Sarker, D Molla. A rule based approach for automatic identification of publication types of medical papers. In Proceedings of the Australasian Document Computing Symposium (ADCS). 2010.

A Sarker, LGC Hamey. Improved reconstruction of flutter shutter images for motion blur reduction. In Proceedings of the International Conference on Digital Image Computing: Techniques and Applications (DICTA). Pages: 317-422. DOI: 10.1109/DICTA.2010.77. 2010. [BEST PAPER PRIZE]

A Sarker. Motion Blur Reduction from Captured Images. Macquarie University. 2010. [UNIVERSITY MEDAL RECIPIENT].

Unpublished (in progress):

Also working on: 

– domain adaptation techniques for social media based text normalization (i.e., efficient normalization of medical tweets)

– deep learning based concept similarity measurements for UMLS

Please email me at abeed@upenn.edu if you want to know more about these research tasks in progress.