





Are you excited about using computational biology to better understand how cancer evolves after treatment? As an intern in our Discovery Data Science team, you will contribute to building a single-cell cancer landscape of treated patients. While single-cell RNA sequencing (scRNA-seq) has been widely used to map tumor biology, most published studies focus on treatment-naïve samples. In contrast, patients entering experimental therapies are often heavily pre-treated, and their tumors may differ significantly at the cellular and molecular level. In this project, you will use large language models (LLMs) and internal harmonized single-cell datasets to identify treated patient samples across studies. You will then integrate these datasets to generate a unified single-cell landscape of treated tumors. Using established computational methods (such as scVI), you will explore treatment-associated biology through analyses such as differential cell type abundance (using the Milo package), trajectory and pseudotime modeling (Slingshot, PAGA, and Monocle), gene network analysis, and cell–cell communication inference (CellPhoneDB and CellChat). The project can take multiple directions depending on your background and interests, so you will have the opportunity to shape its focus in line with your skills and your university’s requirements. Ultimately, your work will contribute to a deeper understanding of tumor biology after treatment and help inform data-driven target discovery strategies in oncology!