Kumar Thurimella

Kumar Thurimella, PhD

Incoming Internal Medicine Resident · Physician-Scientist in Training

Stanford Health Care (Incoming) · University of Colorado School of Medicine · University of Cambridge

About

I am a final-year MD student at the University of Colorado School of Medicine and recently completed my PhD at the University of Cambridge. This summer, I will begin Internal Medicine residency at Stanford Health Care on the ABIM Research Pathway (Short Track), with a guaranteed Rheumatology fellowship at Stanford to follow. Before medical school, I worked as a software engineer at Uber, where I first saw the potential of computation to find precise signals within vast, complex systems. That conviction — that algorithms could help unravel the mechanisms behind autoimmune disease — led me to pursue a career as a physician-scientist.

My work sits at the intersection of AI, computational biology, and clinical medicine. I am drawn to rheumatology, where immune dysregulation, complex patient data, and computational modeling converge with the opportunity to develop targeted therapies for patients who need them most.

Current Work

My PhD in Biotechnology and Statistics/Deep Learning focused on using protein language models to discover new microbial enzymes linked to immune-mediated diseases. I was advised by Dr. Sergio Bacallado at Cambridge and Dr. Ramnik Xavier at the Broad Institute and Mass General Hospital.

My newest paper, Identifying microbial protease allergens through protein language model-guided homology, was just published in Cell Systems (2026). It introduces a deep learning framework using protein language models to uncover candidate allergenic serine proteases across gut and oral microbiome gene catalogs. The work was recently profiled by the Cambridge Department of Chemical Engineering and Biotechnology. Other recent work includes CAZyLingua, a protein language model-based tool for annotating carbohydrate-active enzymes in metagenomics (BMC Bioinformatics, 2025).

I am grateful to the Gates Cambridge Scholarship and Rotary International Scholarship for funding my PhD.

Medical Education

6/6 Core Clerkship Honors · Alpha Omega Alpha (AOA) · Rheumatology Interest

At the University of Colorado School of Medicine, I earned Honors in all six core clinical clerkships — Internal Medicine, OB/GYN, Pediatrics, Family Medicine, Psychiatry, and Surgery — as well as in my Rheumatology and Medicine Acting Internship rotations. I was elected to Alpha Omega Alpha (AOA), the national medical honor society.

I will continue my clinical training at Stanford Health Care, where I matched into Internal Medicine on the ABIM Research Pathway (Short Track), beginning June 2026. The pathway leads directly into a guaranteed Rheumatology fellowship at Stanford starting in 2028, allowing me to integrate rigorous clinical training with protected research time as I build a career as a physician-scientist focused on autoimmune disease.

Education & Experience

Internal Medicine Residency, ABIM Research Pathway

Jun 2026 – Jun 2028

Stanford Health Care

Short Track · Guaranteed Rheumatology Fellowship at Stanford, Jul 2028–

PhD, Biotechnology & Mathematics/Statistics

Oct 2020 – Apr 2024

University of Cambridge

Gates Cambridge Scholar

MD, Expected May 2026

2018–2026

University of Colorado School of Medicine

Honors in 6/6 core clerkships · AOA

MPhil, Computational Biology

2017–2018

Wellcome Sanger Institute / University of Cambridge

Software Engineer II

2015–2017

Uber

San Francisco, CA

Junior Software Developer

2014–2015

ThoughtWorks

San Francisco, CA

BS, Applied Mathematics

2009–2013

University of Colorado Boulder

Magna Cum Laude · Advisor: Rob Knight, PhD

Selected Papers

Cell Systems · 2026

Identifying microbial protease allergens through protein language model-guided homology

A deep learning framework using protein language models to identify candidate allergenic serine proteases across gut and oral microbiome gene catalogs.

Thurimella, K., Wu, E., Li, C., Graham, D. B., Owens, R. M., Plichta, D. R., Sokol, C. L., Xavier, R. J., & Bacallado, S. (2026). Identifying microbial protease allergens through protein language model-guided homology. Cell Systems, 101510.

BMC Bioinformatics · 2025

Protein language models uncover carbohydrate-active enzyme function in metagenomics

CAZyLingua is the first annotation tool to use protein language models for accurate classification of carbohydrate-active enzyme families and subfamilies in metagenomics.

Thurimella, K., Mohamed, A. M., Li, C., Vatanen, T., Graham, D. B., Owens, R. M., La Rosa, S. L., Plichta, D. R., Bacallado, S., & Xavier, R. J. (2025). Protein language models uncover carbohydrate-active enzyme function in metagenomics. BMC Bioinformatics, 26, 285.

JMIR Cardio (Under Review) · 2024

Pilot randomized controlled trial: Increased engagement in cardiovascular health through AI-enabled tailored messaging

A pilot randomized controlled trial showing that AI-enabled tailored messaging increases engagement in cardiovascular health interventions.

Xia, A.†, Thurimella, K.†, Bull, S., Waughtal, J., Chavez, C., Novins-Montague, S., Silvasstar, J., Salyers, A., Ho, M. P., & Lavieri, M. (2024). Pilot randomized controlled trial: Increased engagement in cardiovascular health through AI-enabled tailored messaging. JMIR Cardio (Under Review). †Co-first authors

Molecular Ecology Resources · 2023

SCNIC: Sparse Correlation Network Investigation for Compositional Data

SCNIC is open-source software that can generate correlation networks and detect and summarize modules of highly correlated features.

Shaffer, M.†, Thurimella, K.†, Sterrett, J. D., & Lozupone, C. A. (2023). SCNIC: Sparse Correlation Network Investigation for Compositional Data. Molecular Ecology Resources, 23(1), 312–325. †Co-first authors