Sarah Falls

Biomedical Informatics

Specializing In:

  • Prediction, machine and deep learning
  • Patient subtype extraction and validation
  • Natural language processing (entity extraction and normalization)

About Sarah Falls


I joined the faculty of the Biostatistics & Bioinformatics Department at Roswell Park Comprehensive Cancer Center as an Assistant Professor of Oncology in October 2022.

I completed a postdoctoral fellowship at the Yale Center for Medical Informatics after earning my PhD in Biomedical Informatics from the SUNY University at Buffalo (UB) and an MS in Statistics from the Ohio State University. Between my degrees, I worked in rehabilitation medicine and clinical psychology as a data manager and statistician. I also earned a BA in Mathematics and Statistics and in English Literature from Canisius College.

My research focuses on modeling information contained in the electronic health record and health data repositories. I am primarily interested in using machine and deep learning in conjunction with terminological and ontological information to pursue precision medicine, focusing on the use of patient-specific data combined with evidence-based medicine to tailor prediction and treatment to an individual. I have a special interest in using multiple sources of data and heterogeneous data formats (free text and structured data elements) to improve outcomes and modeling. When using free text, I have focused primarily on entity extraction, normalization to ontologies and terminologies, and relation extraction.

I'm passionate about creating useful tools and algorithms with an interdisciplinary team of clinicians, health care professionals, patients, and data scientists that will help to eradicate cancer and improve the patient and clinician experience.


Roswell Park Comprehensive Cancer Center
  • Assistant Professor of Oncology
  • Department of Biostatistics and Bioinformatics


Education and Training:

  • 2021 - PhD - Biomedical Informatics, Clinical Informatics, University at Buffalo, Buffalo, NY
  • 2013 - MS - Statistics, Ohio State University, Columbus, OH


  • 2021-2022 - Postdoctoral Fellowship, Yale Center for Medical Informatics, Yale University, New Haven, CT

Honors & Awards:

  • 2022 - Yale-Mayo Clinic Center of Excellence in Regulatory Science and Innovation (CERSI) Scholars Award
  • 2020-2021 - CTSI Pilot Study Program Funding for ‘Prediction of opioid-related drug combinations using CANDO and a WNY retrospective cohort.’
  • 2017-2020 - NIH NLM T15 Training, State University of New York at Buffalo, Buffalo, NY
  • 2017 - Best Student Paper Award for Secondary Use of EHR: Interpreting Clinician Inter-Rater Reliability Through Qualitative Assessment at the Context Sensitive Health Informatics Conference, Hong Kong
  • 2013 - Thomas E and Jean D Powers Award for Outstanding Teaching Associate, The Ohio State University, Columbus, OH


Full Publications list on PubMed

Mullin S, Wyk BV, Asher JL, Compton SR, Allore HG, Zeiss CJ. Modeling pandemic to endemic patterns of SARS-CoV-2 transmission using parameters estimated from animal model data. PNAS Nexus. 2022 Jul;1(3):pgac096.

Elkin PL, Mullin S, Tetewsky S, Resendez SD, McCray W, Barbi J, Yendamuri S. Identification of Patient Characteristics Associated with Survival Benefit from Metformin treatment in Stage I NSCLC. The Journal of Thoracic and Cardiovascular Surgery. 2022 Mar 10.

Elkin PL, Mullin S, Mardekian J, Crowner C, Sakilay S, Sinha S, Brady G, Wright M, Nolen K, Trainer J, Koppel R. Using Artificial Intelligence With Natural Language Processing to Combine Electronic Health Record’s Structured and Free Text Data to Identify Nonvalvular Atrial Fibrillation to Decrease Strokes and Death: Evaluation and Case-Control Study. Journal of Medical Internet Research. 2021 Nov 9;23(11):e28946.

Mullin S, Zola J, Lee R, Hu J, MacKenzie B, Brickman A, Anaya G, Sinha S, Li A, Elkin PL. Longitudinal K-means approaches to clustering and analyzing EHR opioid use trajectories for clinical subtypes. Journal of Biomedical Informatics. 2021 Aug 16:103889.

Mullin S, Elkin P. Assessing Opioid Use Patient Representations and Subtypes. Studies in Health Technology and Informatics. 2020 Jun 1;270:823-7.