Summer Research Programmes

 

2026 Academic CMP Projects

Below you will find the list of academic CMP projects hosted by other departments and labs within the University of Cambridge (jump to list).  Click here to see the list of projects hosted by external companies.  

New projects may be added throughout Lent term so be sure to check back regularly!

 

How to Apply

Unless alternative instructions are given in the project listing, to apply for a project you should send your CV to the contact provided along with a covering email which explains why you are interested in the project and why you think you would be a good fit.  

Need help preparing a CV or advice on how to write a good covering email? 

The Careers Service are there to help!  Their CV and applications guides are packed full of top tips and example CVs.  

Looking for advice on applying for CMP projects specifically?  Check out this advice from CMP Co-Founder and Cambridge Maths Alumnus James Bridgwater.  

Remember: it’s better to put the work into making fewer but stronger applications tailored to a specific project than firing off a very generic application for all projects – you won’t stand out with the latter approach!  

Please note that to participate in the CMP programme you must be a student in Part IB, Part II, or Part III of the Mathematical Tripos at Cambridge.  

 

Want to know more about a project before you apply? 

Come along to the CMP Lunchtime Seminar Series in February 2026 to hear the hosts give a short presentation about their project.  There will be an opportunity afterwards for you to chat informally with hosts about their projects. 

Alternatively (or as well!), you can reach out to the contact given in the project listing to ask questions. 

 


Academic CMP Project Proposals for Summer 2026

 

Denoising and optimising neural data in cochlear implant users

Project Title Denoising and optimising neural data in cochlear implant users
Keywords auditory neuroscience, hearing technology, neuro-prosthetics, cochlear implants, non-linear optimization, sequential quadratic programming, neural modelling, biomedical devices, denoising
Project Listed 9 January 2026
Project Status Open
Application Deadline 18 February 2026
Project Supervisor Charlotte Garcia
Contact Name Charlotte Garcia
Contact Email charlotte.garcia@mrc-cbu.cam.ac.uk
Company/Lab/Department MRC Cognition & Brain Sciences Unit
Address 15 Chaucer Road, Cambridge, CB2 7EF
Project Duration 8 weeks full-time (exact timing to be co-ordinated with a successful candidate)
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information

There are a few kinds of hearing technologies that help people with hearing loss to hear. These include hearing aids that acoustically amplify sound entering someone’s ear. While hearing aids are normally prescribed and programmed by a healthcare professional, many over-the-counter hearing aids have recently become available as well, and even AirPods contain hearing-assistive programs. However, these types of devices do not help those with more severe hearing loss. A more complex type of hearing device bypasses the outermost parts of the auditory system and is more like a bionic ear: the cochlear implant.

A cochlear implant is a neural prosthetic that provides people with a profound hearing impairment with auditory perception by directly electrically stimulating the auditory nerve. It requires a surgical operation wherein a small string of electrodes is inserted into the patient’s inner ear. These electrodes are then controlled using a speech processor with a microphone that sits behind the patient’s ear. Most recipients can perceive speech well with their implant, and some enjoy music. Cochlear implants are arguably the most successful auditory prosthetic in existence today, with over 1 million users globally.

However, many users struggle to understand speech with their implants, especially in challenging listening conditions with background noise. This may be because individual cochlear implant users lose their hearing for different reasons, and the cochlear implant software and settings are not optimised for their unique pattern of hearing loss. We have developed a diagnostic tool called the Panoramic ECAP method that is designed to provide patient-specific indicators of the interaction between a patient’s implant and their auditory system. It involves applying a non-linear optimisation algorithm to electrophysiological measurements of neural activity in the cochlea. These measurements are called ‘Electrically Evoked Compound Action Potentials’ or ‘ECAPs’ for short. It is our hope that this tool can be used in a clinical setting to personalise and optimise cochlear-implant software and enable patients to achieve their hearing potential with their device.

Project Description

The Panoramic ECAP (PECAP) method provides two estimates that describe the interaction between a cochlear implant patient’s inner ear (a.k.a. cochlea [1]) and their implant: the health of the auditory nerves and the spread of electrical current. The two PECAP estimates are quantified for each electrode of the cochlear implant. Areas of poor neural responsiveness or wide current spread reduce the efficiency of delivering auditory information from the neuroprosthetic device to the brain. The PECAP method estimates the neural responsiveness and the current spread at each electrode separately, using a nonlinear optimisation algorithm based on sequential quadratic programming whose structure is based on a theoretical framework [2]. We have conducted experiments to validate that these estimates are accurate and indeed separate from each other. While the results of our previous work are encouraging for clinical translation, long neural-response recording times and noisiness of some recorded data are barriers to translating this research to the clinic.
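To make the idea of fitting per-electrode estimates with sequential quadratic programming concrete, here is a minimal sketch. It is not the published PECAP formulation: the forward model (Gaussian excitation patterns whose overlap is weighted by a per-place responsiveness), the electrode count, the bounds and all numerical values are assumptions chosen for illustration. It uses SciPy's `SLSQP` method, which is one SQP routine, on synthetic data.

```python
import numpy as np
from scipy.optimize import minimize

N = 6                               # hypothetical number of electrodes
pos = np.arange(N, dtype=float)     # electrode and cochlear-place positions (toy units)


def forward(sigma, eta):
    """Toy ECAP matrix: M[i, j] = sum_x eta[x] * E[i, x] * E[j, x]."""
    # E[i, x]: excitation produced by electrode i at place x (Gaussian spread sigma[i])
    E = np.exp(-(pos[None, :] - pos[:, None]) ** 2 / (2.0 * sigma[:, None] ** 2))
    return (E * eta[None, :]) @ E.T


def loss(params, M_obs):
    """Sum of squared residuals between the toy model and the observed matrix."""
    sigma, eta = params[:N], params[N:]
    return float(np.sum((forward(sigma, eta) - M_obs) ** 2))


# Synthetic "recording": made-up ground truth plus a little measurement noise.
rng = np.random.default_rng(0)
sigma_true = np.array([1.0, 1.4, 0.8, 1.2, 1.6, 1.0])
eta_true = np.array([1.0, 0.9, 0.4, 0.8, 1.0, 0.7])
M_obs = forward(sigma_true, eta_true) + 0.01 * rng.standard_normal((N, N))

# Start from a uniform guess and keep both estimates inside physical bounds.
x0 = np.ones(2 * N)
bounds = [(0.3, 5.0)] * N + [(0.0, 2.0)] * N
res = minimize(loss, x0, args=(M_obs,), method="SLSQP", bounds=bounds)
sigma_fit, eta_fit = res.x[:N], res.x[N:]
```

Comparing `sigma_fit` and `eta_fit` against the made-up ground truth on synthetic data like this is one simple way to explore how noise level or a reduced set of probe/masker pairs affects the recovered estimates.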

This project will have two facets that must be balanced with each other:

  1. Building on a previous CMP student's work, determine if we can record less neural data (i.e. requiring a shorter amount of time to record) to accurately reproduce the PECAP algorithm's current-spread and neural-responsiveness estimates.
  2. Evaluate the potential and effectiveness of a recently-published denoising technique [3] to reduce measurement noise and thereby increase the percentage of cochlear-implant patients with whom the PECAP technique can be used (i.e. in cases of noisy neural data).

Two previous CMP students, Taren Rughooputh (2018) and Scott Hislop (2023), conducted similar projects in our laboratory. Their work contributed to academic publications on which they are listed as authors (Rughooputh [2]; Hislop: Garcia et al., 2024, DOI: 10.1007/s10162-024-00966-x).

References

[1] Background Coursework from Duke University describing the peripheral auditory system:
https://gb.coursera.org/lecture/medical-neuroscience/peripheral-auditory-mechanisms-part-1-SQ6H7.
I recommend watching the first 3 videos of week 7 for a review of the human auditory system

[2] The publication describing the ‘PECAP’ algorithm that this project is based on:
Garcia, C., Goehring, T., Cosentino, S. et al. The Panoramic ECAP Method: Estimating Patient-Specific Patterns of Current Spread and Neural Health in Cochlear Implant Users. JARO 22, 567–589 (2021).
https://doi.org/10.1007/s10162-021-00795-2

[3] The publication describing the de-noising technique:
Kung F. J. (2025). An Integrated Spatial-Spectral Denoising Framework for Robust Electrically Evoked Compound Action Potential Enhancement and Auditory Parameter Estimation. Sensors (Basel, Switzerland), 25(11), 3523.
https://doi.org/10.3390/s25113523

Work Environment For this project, you will be working within a lab with flexible, full-time working hours. You will have your own individual project stream that will be primarily supervised by Research Fellow Charlotte Garcia, with whom you will have weekly meetings. There are a number of post-docs in the lab as well with whom you will be able to interact. You will also attend weekly lab meetings (led by our PI, Bob Carlyon) at the MRC Cognition and Brain Sciences Unit (1), which is part of the larger Cambridge Hearing Group (2). Toward the end of the project, you will have the opportunity to present your project work at one of these meetings.
(1) MRC Cognition & Brain Sciences Unit Website: https://www.mrc-cbu.cam.ac.uk/
(2) Cambridge Hearing Group Website: https://www.hearing-research.group.cam.ac.uk/
Prerequisite Skills Statistics, Numerical Analysis, Data Visualisation, basic programming skills in MATLAB or Python are required
Other skills used in the Project Simulation, Predictive Modelling
Acceptable Programming Languages Python, MATLAB
Additional Requirements Ability to communicate mathematical concepts with a non-expert audience, interest in medical devices and/or sound, enthusiasm
Application Instructions To apply for this project, please submit a CV via email. In your email, please include one paragraph (max 200 words) demonstrating why you are interested in the topic, what you think you would bring to the project, and why you think you are a good fit.

 

GPU-Accelerated Detection of Subthreshold Gravitational Wave Signals

Project Title GPU-Accelerated Detection of Subthreshold Gravitational Wave Signals
Keywords Gravitational waves, Bayesian inference, nested sampling, GPU computing, multi-messenger astronomy
Project Listed 9 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisor Will Handley
Contact Name Will Handley
Contact Email wh260@cam.ac.uk
Company/Lab/Department Institute of Astronomy, University of Cambridge
Address Institute of Astronomy, Madingley Road, Cambridge CB3 0HA
Project Duration 8 weeks, full-time, June-September 2026
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information

Since the landmark detection of GW170817, gravitational wave astronomers have been searching for additional "bright sirens" - gravitational wave events with electromagnetic counterparts. However, LIGO's frequentist methodology using false alarm rates may have missed signals buried just below detection thresholds. These subthreshold signals could provide crucial cosmological information, including independent measurements of the Hubble constant to address the Hubble tension.

Bayesian model comparison offers a principled framework for assessing whether multi-messenger information (such as gamma-ray burst sky localizations) can help identify gravitational wave signals that would otherwise be classified as noise. Recent advances in GPU-accelerated inference using JAX-based tools like BlackJAX and JimGW make this computationally tractable.
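As a rough illustration of how conditioning on a sky localisation can change the Bayesian evidence, the sketch below compares two priors on a single sky angle for the same toy likelihood. Everything in it is an assumption made for illustration (a one-dimensional sky, a Gaussian likelihood, a Gaussian stand-in for the GRB localisation, and the function names); real analyses work on the full sky and over many more parameters.

```python
import math

TWO_PI = 2.0 * math.pi


def gaussian_pdf(x, mu, sd):
    """Normal density, used here as a stand-in for a GRB sky localisation."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(TWO_PI))


def log_like(phi):
    """Hypothetical GW likelihood in sky angle phi: peak at 2.0 rad, width 0.5 rad."""
    return -0.5 * ((phi - 2.0) / 0.5) ** 2


def log_evidence(prior_pdf, n=20000):
    """Midpoint-rule estimate of log of the integral of L(phi) * prior(phi) on [0, 2*pi)."""
    h = TWO_PI / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        total += math.exp(log_like(phi)) * prior_pdf(phi) * h
    return math.log(total)


def uniform_prior(phi):
    """Unconditioned model: no sky information."""
    return 1.0 / TWO_PI


def grb_prior(centre, width=0.1):
    """Sky-conditioned model: prior concentrated on the (hypothetical) GRB position."""
    return lambda phi: gaussian_pdf(phi, centre, width)


log_z_uncond = log_evidence(uniform_prior)
# GRB position agrees with the likelihood peak: evidence should go up.
log_bf_aligned = log_evidence(grb_prior(2.0)) - log_z_uncond
# GRB position far from the likelihood peak: evidence should go down.
log_bf_offset = log_evidence(grb_prior(5.0)) - log_z_uncond
```

The log Bayes factor is positive when the external localisation agrees with where the gravitational wave data put the source and strongly negative when it does not, which is the behaviour a coincidence search would rely on.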

Project Description

The student will develop GPU-accelerated nested sampling pipelines to search for subthreshold gravitational wave signals by conditioning on sky localizations from gamma-ray bursts. Specific tasks include:

  1. Learning the fundamentals of gravitational wave data analysis and Bayesian nested sampling
  2. Implementing GPU-accelerated inference using JAX, BlackJAX, and JimGW 
  3. Generating realistic mock gravitational wave data using BILBY and injecting signals at various signal-to-noise ratios
  4. Comparing Bayesian evidence between unconditioned and sky-conditioned models
  5. Applying the pipeline to real LIGO data coincident with short gamma-ray bursts from the Fermi GRB catalogue

A successful outcome would be a validated pipeline capable of identifying candidate subthreshold signals and quantifying the evidential support for their astrophysical origin. The project uses mathematical skills in probability theory, Bayesian inference, and high-dimensional integration (nested sampling). Programming skills in Python and familiarity with numerical computing are essential.
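For readers new to nested sampling, a deliberately tiny CPU version is sketched below. It is plain Python, not the BlackJAX or JimGW interface, and it makes simplifying assumptions: a one-dimensional Gaussian likelihood, a uniform prior on [-5, 5], replacement of the worst live point by rejection sampling from the prior, and the deterministic volume shrinkage X_i = exp(-i / n_live). Under these assumptions the evidence is very close to 0.1, so the estimate can be checked against log(0.1).

```python
import math
import random


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_like(x, sd=0.5):
    """Normalised Gaussian likelihood centred on zero (toy choice)."""
    return -0.5 * (x / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def sample_prior(rng):
    """Uniform prior on [-5, 5] (toy choice)."""
    return rng.uniform(-5.0, 5.0)


def nested_sampling(n_live=100, n_iter=600, seed=0):
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logl = [log_like(x) for x in live]
    log_z = -math.inf
    log_x_prev = 0.0  # log of the enclosed prior volume, starts at log(1)
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_star = live_logl[worst]
        log_x = -i / n_live
        # Weight of the discarded point: the shell of prior volume it represents.
        log_w = math.log(math.exp(log_x_prev) - math.exp(log_x))
        log_z = logaddexp(log_z, logl_star + log_w)
        log_x_prev = log_x
        # Replace the worst point with a prior draw above the likelihood threshold.
        while True:
            x = sample_prior(rng)
            ll = log_like(x)
            if ll > logl_star:
                break
        live[worst], live_logl[worst] = x, ll
    # Add the contribution of the remaining live points.
    log_w_live = log_x_prev - math.log(n_live)
    for ll in live_logl:
        log_z = logaddexp(log_z, ll + log_w_live)
    return log_z


log_z_estimate = nested_sampling()
```

Rejection sampling from the prior becomes very inefficient as the enclosed volume shrinks, which is one reason practical samplers use constrained proposals and why hardware acceleration is attractive for high-dimensional problems.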

References 1. Abbott et al. (2017) "GW170817: Observation of Gravitational Waves from a Binary Neutron Star Inspiral" Physical Review Letters 119, 161101
2. Skilling (2006) "Nested Sampling for General Bayesian Computation" Bayesian Analysis 1, 833
3. Ashton et al. (2019) "BILBY: A user-friendly Bayesian inference library" ApJS 241, 27
4. Wong et al. (2023) "JimGW: A JAX-based gravitational wave inference library" https://github.com/kazewong/jim
5. Cabezas et al. (2024) "BlackJAX: Composable Bayesian inference in JAX" arXiv:2402.10797
Work Environment The student will work within the Handley research group at the Institute of Astronomy, joining a team of approximately 10 PhD students and postdocs working on Bayesian inference methods in astrophysics and cosmology. They will have a dedicated desk in the IoA and access to GPU computing resources. Weekly group meetings and one-to-one supervision sessions will be held. Standard office hours are expected, with flexibility for remote work as appropriate. The group maintains an active Slack workspace for day-to-day communication.
Prerequisite Skills Statistics, Probability / Markov Chains, Numerical Analysis
Other skills used in the Project Mathematical Physics, Simulation, Data Visualisation
Acceptable Programming Languages Python
Additional Requirements Enthusiasm for learning new computational methods; willingness to engage with challenging mathematical and physical concepts; good communication skills for presenting results; ability to work both independently and as part of a research team.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

GPU-Accelerated Hubble Constant Inference from Gravitational Waves

Project Title GPU-Accelerated Hubble Constant Inference from Gravitational Waves
Keywords Hubble constant, gravitational waves, Bayesian inference, GPU computing, cosmology
Project Listed 9 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisor Will Handley
Contact Name Will Handley
Contact Email wh260@cam.ac.uk
Company/Lab/Department Institute of Astronomy, University of Cambridge
Address Institute of Astronomy, Madingley Road, Cambridge CB3 0HA
Project Duration 8 weeks, full-time, June-September 2026
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information

The Hubble tension - the 4-6σ disagreement between early-universe (CMB) and late-universe (supernovae) measurements of the cosmic expansion rate H₀ - is one of the most pressing problems in modern cosmology. Gravitational wave "standard sirens" offer a completely independent method to measure H₀, as the luminosity distance is encoded directly in the gravitational wave signal amplitude.

GW170817, the first binary neutron star merger with an electromagnetic counterpart, provided the first standard siren measurement of H₀. However, standard parameter estimation pipelines require thousands of CPU hours, limiting rapid multi-messenger follow-up. GPU-accelerated inference using JAX can reduce this to minutes, enabling real-time cosmological constraints from future detections.
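The standard-siren idea can be sketched with the low-redshift Hubble law v = H₀ d: a distance estimate from the gravitational wave signal and a recession velocity from the host galaxy jointly constrain H₀. The example below is a minimal grid calculation under stated assumptions: Gaussian measurement errors, flat priors on H₀ and on distance, and illustrative input numbers that are not taken from any published analysis.

```python
import math


def h0_posterior(d_obs, sigma_d, v_obs, sigma_v, h0_grid, d_grid):
    """Grid posterior on H0, marginalised over the true distance d (flat priors)."""
    post = []
    for h0 in h0_grid:
        total = 0.0
        for d in d_grid:
            # Distance measurement from the GW amplitude (Gaussian error, assumed).
            like_d = math.exp(-0.5 * ((d_obs - d) / sigma_d) ** 2)
            # Velocity measurement compared with the Hubble-law prediction h0 * d.
            like_v = math.exp(-0.5 * ((v_obs - h0 * d) / sigma_v) ** 2)
            total += like_d * like_v
        post.append(total)
    step = h0_grid[1] - h0_grid[0]
    norm = sum(post) * step  # normalise on the grid only
    return [p / norm for p in post]


h0_grid = [40.0 + 0.5 * i for i in range(161)]   # km/s/Mpc, 40 to 120
d_grid = [20.0 + 0.25 * j for j in range(161)]   # Mpc, 20 to 60

# Illustrative inputs: d = 40 +/- 5 Mpc, v = 2800 +/- 150 km/s.
posterior = h0_posterior(40.0, 5.0, 2800.0, 150.0, h0_grid, d_grid)
h0_map = h0_grid[posterior.index(max(posterior))]
```

Swapping the flat distance prior for a different one and re-running is a quick way to see the kind of prior sensitivity that task 5 in the project description refers to.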

Project Description

The student will develop and validate a GPU-accelerated pipeline for joint inference of gravitational wave source parameters and the Hubble constant. Specific tasks include:

  1. Understanding the theoretical framework connecting gravitational wave observations to cosmological distance measurements
  2. Implementing joint parameter estimation for GW source properties and H₀ using JAX-based tools
  3. Validating the pipeline on the benchmark event GW150914
  4. Reproducing the H₀ posterior from GW170817 and comparing with published LVK collaboration results
  5. Investigating prior sensitivity by comparing direct sampling approaches with reweighting techniques
  6. Benchmarking GPU vs CPU performance

A successful outcome would be a validated, fast pipeline for H₀ inference that reproduces published results while providing insight into statistical methodology. The project uses mathematical skills in Bayesian inference, cosmology, and high-dimensional sampling. Strong programming skills in Python are essential.

References 1. Abbott et al. (2017) "A gravitational-wave standard siren measurement of the Hubble constant" Nature 551, 85
2. Schutz (1986) "Determining the Hubble constant from gravitational wave observations" Nature 323, 310
3. Riess et al. (2022) "A Comprehensive Measurement of the Local Value of the Hubble Constant" ApJ 934, L7
4. Planck Collaboration (2020) "Planck 2018 results. VI. Cosmological parameters" A&A 641, A6
5. Wong et al. (2023) "JimGW: A JAX-based gravitational wave inference library" https://github.com/kazewong/jim
Work Environment The student will work within the Handley research group at the Institute of Astronomy, joining a team of approximately 10 PhD students and postdocs working on Bayesian inference methods in astrophysics and cosmology. They will have a dedicated desk in the IoA and access to GPU computing resources. Weekly group meetings and one-to-one supervision sessions will be held. Standard office hours are expected, with flexibility for remote work as appropriate. The group maintains an active Slack workspace for day-to-day communication.
Prerequisite Skills Statistics, Probability / Markov Chains, Numerical Analysis
Other skills used in the Project Mathematical Physics, Simulation, Data Visualisation
Acceptable Programming Languages Python
Additional Requirements Enthusiasm for learning new computational methods; willingness to engage with challenging mathematical and physical concepts; good communication skills for presenting results; ability to work both independently and as part of a research team.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Acquisition of foreign DNA by transmissible cancer genomes

Project Title Acquisition of foreign DNA by transmissible cancer genomes
Keywords Genomics, evolution, viruses, sequencing, cancer
Project Listed 9 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisor Professor Elizabeth Murchison and Dr Kevin Gori
Contact Name Kevin Gori
Contact Email kcg25@cam.ac.uk
Company/Lab/Department Department of Veterinary Medicine
Address Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, CB3 0ES
Project Duration 8 weeks between June and September 2026
Project Open to Masters students (Part III)
Background Information

Transmissible cancers are infectious tumours that spread through animal populations as clonal allografts [1]. Three are known to affect mammals: Canine Transmissible Venereal Tumour (CTVT) [2], and two forms of Tasmanian Devil Facial Tumour disease (DFT1 and DFT2) [3,4]. Several more forms have been discovered as disseminated neoplasias in marine bivalves [5].

One common feature of transmissible cancers is that they have outlived the progenitor that founded them. Subsequently they have been exposed to new, temporary hosts of diverse genotype and microbiota. DNA from one such host has been permanently incorporated into a transmissible cancer genome on at least one occasion [6].

However, we don’t know the extent to which microbial and viral material has been included. This project aims to assess this through the analysis of several high-coverage, short read sequencing data derived from CTVT, DFT1 and DFT2 tumour samples, and from normal tissue obtained from their matched hosts.

1. Ní Leathlobhair M, Lenski RE. Population genetics of clonally transmissible cancers. Nat Ecol Evol. 2022;6: 1077–1089.
2. Murchison EP, Wedge DC, Alexandrov LB, Fu B, Martincorena I, Ning Z, et al. Transmissible dog cancer genome reveals the origin and history of an ancient cell lineage. Science. 2014;343: 437–440.
3. Pye RJ, Pemberton D, Tovar C, Tubio JMC, Dun KA, Fox S, et al. A second transmissible cancer in Tasmanian devils. Proc Natl Acad Sci U S A. 2016;113: 374–379.
4. Hawkins CE, Baars C, Hesterman H, Hocking GJ, Jones ME, Lazenby B, et al. Emerging disease and population decline of an island endemic, the Tasmanian devil Sarcophilus harrisii. Biol Conserv. 2006;131: 307–324.
5. Metzger MJ, Villalba A, Carballal MJ, Iglesias D, Sherry J, Reinisch C, et al. Widespread transmission of independent cancer lineages within multiple bivalve species. Nature. 2016. Available: http://www.nature.com/nature/journal/vaop/ncurrent/full/nature18599.html
6. Gori K, Baez-Ortega A, Strakova A, Stammnitz MR, Wang J, Chan J, et al. Horizontal transfer of nuclear DNA in transmissible cancer. Proc Natl Acad Sci U S A. 2025;122: e2424634122.

Project Description

This project will investigate whether we can detect the inclusion of foreign DNA in the genomes of CTVT, DFT1 and DFT2. In the case of CTVT, we are already aware of a novel retrovirus that has implanted in the genome, and that papillomavirus is a common coinfection. The project will initially focus on detecting and characterising these viruses, before expanding to look for evidence of acquisition from any source.

Sequence that has inserted into the genome is of particular interest; a challenging part of the project will be to determine whether foreign DNA is integrated into the tumour genome, or is instead associated with the cellular environment. The project uses high-coverage, short read sequencing data derived from CTVT, DFT1 and DFT2 tumour samples, and from normal tissue obtained from their matched hosts. It will develop skills in bioinformatics, especially in handling genomic data, running sequence assembly, and comparing and annotating sequence contigs. Sequence analysis is an inherently mathematical topic. Assembling contigs is a shortest common superstring problem: by representing short sequence fragments (k-mers) as edges in a graph whose nodes are their overlaps, this amounts to finding an Eulerian path through the graph [7]. Sequence evolution models changes in sequence as a continuous time Markov process operating over a tree [8]. The large amount of data produced by sequencing projects necessitates use of statistics to simplify and identify patterns in the data. This project, though biological, has scope for the application and development of mathematical techniques, and would be of interest to a mathematician.
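A toy version of the Eulerian-path view of assembly is sketched below. In the de Bruijn formulation, each k-mer is an edge from its (k-1)-letter prefix to its (k-1)-letter suffix, and a path that uses every edge once spells out the sequence. The sketch assumes error-free k-mers, each occurring exactly once, and a sequence with no repeated (k-1)-mers; real assemblers have to handle sequencing errors, repeats and both DNA strands. The function names are illustrative.

```python
from collections import defaultdict


def de_bruijn(kmers):
    """Build a graph with one edge per k-mer: prefix (k-1)-mer -> suffix (k-1)-mer."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm: a path that traverses every edge exactly once."""
    out_deg = {u: len(vs) for u, vs in graph.items()}
    in_deg = defaultdict(int)
    for vs in graph.values():
        for v in vs:
            in_deg[v] += 1
    nodes = set(out_deg) | set(in_deg)
    # The path must start at the node with one more outgoing than incoming edge.
    start = next(
        (n for n in nodes if out_deg.get(n, 0) - in_deg.get(n, 0) == 1), None
    )
    if start is None:  # Eulerian cycle: any node with an outgoing edge will do
        start = next(iter(graph))
    edges = {u: list(vs) for u, vs in graph.items()}  # copy, consumed below
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if edges.get(u):
            stack.append(edges[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(kmers):
    """Spell the sequence along the Eulerian path."""
    path = eulerian_path(de_bruijn(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])


sequence = "ATGGCGTGCA"
k = 4
reads = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
reads.reverse()  # the input order of the k-mers should not matter
contig = assemble(reads)
```

Here `contig` reproduces `sequence` because every 3-mer in this toy sequence is distinct, so the Eulerian path is unique; with repeats, several paths (and therefore several candidate assemblies) can exist.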

7. Pevzner PA, Tang H, Waterman MS. An Eulerian path approach to DNA fragment assembly. Proc Natl Acad Sci U S A. 2001;98: 9748–9753.
8. Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 1981;17: 368–376.

References https://www.tcg.vet.cam.ac.uk/
Work Environment You will be working in Professor Elizabeth Murchison's Transmissible Cancer Group situated in the Department of Veterinary Medicine. You will work as part of a team, comprised of several undergraduate project students, PhD students, and post-doctoral researchers, any of whom can be approached to talk about and support your project. Remote work is possible, but physical attendance is preferred. We work standard, flexible office hours, with core hours of 10am-4pm.
Prerequisite Skills Statistics, Data Visualisation
Other skills used in the Project Probability / Markov Chains, Geometry / Topology, Simulation, Predictive Modelling
Acceptable Programming Languages Python, R, or any other general purpose language
Additional Information Able to learn to use unfamiliar software, interest in biology or medicine, can communicate new perspectives
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Mechanochemical Feedback and Growth Dynamics in Rod-Like Plant Organs

Project Title Mechanochemical Feedback and Growth Dynamics in Rod-Like Plant Organs
Keywords Plant growth, morphogenesis, differential geometry, continuum mechanics, PDEs, numerical analysis
Project Listed 9 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisor Amir Porat
Contact Name Amir Porat
Contact Email ap2430@cam.ac.uk
Company/Lab/Department Sainsbury Laboratory, University of Cambridge
Address Sainsbury Laboratory, University of Cambridge, 47 Bateman Street, Cambridge, CB2 1LR
Project Duration 8 weeks, full-time
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information Plant morphogenesis results from the interaction of growth, mechanics, and genetic regulation. A key mathematical challenge in this area is to understand how spatially distributed biochemical morphogens give rise to observed growth patterns. This project focuses on plant organs with rod-like symmetry, such as shoots and roots, whose geometry allows their dynamics to be represented by the spatiotemporal evolution of a one-dimensional centerline embedded in three-dimensional space [1-3]. This reduction enables a mathematically tractable description of growth and bending driven by internal mechanical and biochemical processes.
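To illustrate the centerline reduction in its simplest form, the sketch below reconstructs a planar centerline from a prescribed curvature profile by integrating dθ/ds = κ(s), dx/ds = cos θ, dy/ds = sin θ. The restriction to a plane, the midpoint integration scheme and the function name are assumptions made for illustration; the models cited above treat the full three-dimensional, growing case.

```python
import math


def integrate_centerline(curvature, length, n=1000):
    """Integrate a planar centerline from its curvature kappa(s).

    Starts at the origin pointing along +x and returns the list of (x, y)
    points together with the final tangent angle.
    """
    ds = length / n
    x = y = theta = 0.0
    points = [(x, y)]
    for k in range(n):
        kappa = curvature((k + 0.5) * ds)      # curvature at the segment midpoint
        theta_mid = theta + 0.5 * kappa * ds   # tangent angle at the segment midpoint
        x += math.cos(theta_mid) * ds
        y += math.sin(theta_mid) * ds
        theta += kappa * ds
        points.append((x, y))
    return points, theta


# Constant curvature 1 over arc length pi traces half of a unit circle,
# a crude caricature of a hook-shaped organ.
points, tip_angle = integrate_centerline(lambda s: 1.0, math.pi)
tip_x, tip_y = points[-1]
```

For the half circle the tip should land close to (0, 2) with the tangent rotated by π, which gives a quick sanity check before moving on to curvature profiles that evolve in time.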
Project Description

The biological motivation of the project is the study of the formation and maintenance of the apical hook, including its response to biochemical perturbations [4], as well as the gravitropic behavior of shoots and roots [5,6]. From a mathematical perspective, the project aims to develop and implement numerical methods for integrating continuum morphoelastic models of rod-like plant organs.

The project will focus on the design and implementation of a numerical scheme to integrate the model proposed in [7]. By assuming morphoelastic roles for morphogens, the model directly links axial growth and bending to spatial morphogen distributions within a quasi-static formulation of growth mechanics. The organ is represented as an array of concentric cylindrical morphoelastic shells connected in parallel, yielding a mathematically rich framework that couples elasticity, growth, and geometry.

To relate this continuum description to cellular processes, gene regulatory networks, and active morphogen transport, the model will be discretized and implemented within an existing custom numerical solver for plant morphodynamics written in C++ [8].

References [1] Porat, Amir, Fabio Tedone, Michele Palladino, Pierangelo Marcati, and Yasmine Meroz. "A general 3d model for growth dynamics of sensory-growth systems: from plants to robotics." Frontiers in Robotics and AI 7 (2020): 89.
[2] Moulton, Derek E., Hadrien Oliveri, and Alain Goriely. "Multiscale integration of environmental stimuli in plant tropism produces complex behaviors." Proceedings of the National Academy of Sciences 117, no. 51 (2020): 32226-32237.
[3] A. Goriely, The Mathematics and Mechanics of Biological Growth. Springer, 2017.
[4] Walia, Ankit, et al. "Differential Growth is an Emergent Property of Mechanochemical Feedback Mechanisms in Curved Plant Organs." Available at SSRN 4677553.
[5] K. Jonsson, Y. Ma, A.-L. Routier-Kierzkowska, and R. P. Bhalerao, “Multiple mechanisms behind plant bending,” Nature Plants, vol. 9, no. 1, pp. 13–21, 2023.
[6] Porat, Amir, Mathieu Rivière, and Yasmine Meroz. "A quantitative model for spatio-temporal dynamics of root gravitropism." Journal of Experimental Botany 75, no. 2 (2024): 620-630.
[7] Porat, Amir, Anne-Lise Routier-Kierzkowska, and Yasmine Meroz. "Understanding Shape and Residual Stress Dynamics in Rod-Like Plant Organs." bioRxiv (2025): 2025-08.
[8] https://gitlab.com/slcu/teamHJ/Organism
Work Environment

The student will be hosted within the Jönsson group, led by Professor Henrik Jönsson, Director of the Sainsbury Laboratory. The group offers a collaborative and intellectually stimulating research environment, with weekly group meetings in which members present and discuss ongoing research.

The student will receive regular one-on-one supervision through weekly meetings with Dr. Amir Porat, with additional guidance and support available as needed throughout the internship.

The Sainsbury Laboratory also provides a vibrant and welcoming environment for summer students, hosting a range of social events and activities designed to foster interaction among students from different programs and research backgrounds.

Prerequisite Skills Fluids, Partial Differential Equations, Geometry / Topology
Other skills used in the Project Simulation, Numerical Analysis, Mathematical Physics
Acceptable Programming Languages Python, C++
Additional Requirements Curiosity about plants and enthusiasm for applying mathematics to living systems.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Cellularization and Perturbations of Self-Similar Plant Growth

Project Title Cellularization and Perturbations of Self-Similar Plant Growth
Keywords Plant growth, morphogenesis, differential geometry, Lie derivatives, self similarity, numerical modelling
Project Listed 9 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisor Amir Porat
Contact Name Amir Porat
Contact Email ap2430@cam.ac.uk
Company/Lab/Department Sainsbury Laboratory, University of Cambridge
Address Sainsbury Laboratory, University of Cambridge, 47 Bateman Street, Cambridge, CB2 1LR
Project Duration 8 weeks, full-time
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information

Plant morphogenesis arises from the interplay of growth, mechanics, and genetic regulation. A central mathematical challenge is understanding how spatially distributed biochemical morphogens generate stable growth patterns. This project focuses on plant organs exhibiting self-similar growth, where global geometry is preserved as cells grow and divide.

Such systems exhibit approximately stationary material flows in suitable coordinates, making them analytically tractable [1]. We have developed a novel continuous geometric framework for self-similar growing manifolds, in which growth is described by the Lie derivative of a Lagrangian metric [2,3]. Under the assumption of shearless growth along orthogonal curvilinear coordinates, the Lie derivative simplifies, providing a compact and powerful description of tissue deformation and growth kinematics.
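For orientation, the Lie derivative of a metric g along a velocity field v has the standard coordinate form written out below. The listing does not state how the project's framework specialises it, so the second display is only an assumed reading: orthogonal coordinates with scale factors h_i, in which "shearless" is taken to mean that the off-diagonal components vanish.

```latex
% Standard identity: Lie derivative of a metric g along a velocity field v
% (k is summed; \nabla is the Levi-Civita connection of g).
(\mathcal{L}_v g)_{ij}
  = v^k \partial_k g_{ij} + g_{kj}\,\partial_i v^k + g_{ik}\,\partial_j v^k
  = \nabla_i v_j + \nabla_j v_i .

% Assumed orthogonal coordinates, g = \mathrm{diag}(h_1^2, h_2^2, \dots);
% k is summed, but there is no sum over i or j.
(\mathcal{L}_v g)_{ii} = v^k \partial_k h_i^2 + 2 h_i^2\, \partial_i v^i ,
\qquad
(\mathcal{L}_v g)_{ij} = h_j^2\, \partial_i v^j + h_i^2\, \partial_j v^i
\quad (i \neq j).
```

Under that reading, the relative rate of change of length along coordinate direction i is (L_v g)_{ii} / (2 g_{ii}) = v^k ∂_k ln h_i + ∂_i v^i, which separates a contribution from moving through a varying metric from a contribution from velocity gradients.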

Project Description

The project is split into two complementary tracks, with the option to focus primarily on one while exploring the other as time permits.

Track 1: 3D Cellularization of Self-Similar Growth (Numerical Modelling)
Extending an existing 2D C++ implementation of self-similar growth in the shoot apical meristem (SAM) [4] to three dimensions. This involves modelling cellular growth and division, developing algorithms to match simulated cell shapes and sizes to live-imaging data [5], and using the resulting geometries to analyse gene patterning and tissue mechanics.

Track 2: Geometric Perturbations of Self-Similar Growth (Analytical Modelling)
Investigating geometric perturbations of self-similar growth, motivated by phyllotaxis - the periodic formation of leaves and flowers on the flanks of the SAM [6]. Starting from the Lie derivative of the Lagrangian metric in self-similarity, we wish to examine how combined perturbations of the velocity and metric fields can generate biologically relevant growth patterns. The work begins with interpreting custom axisymmetric shearless perturbations of shells using the Lie derivative formalism, followed by periodic or more general perturbations, with the aim of establishing a new geometric framework for studying shape dynamics and morphogenesis.

References [1] Hejnowicz, Zygmunt. "Trajectories of principal directions of growth, natural coordinate system in growing plant organ." Acta Societatis Botanicorum Poloniae 53, no. 1 (1984): 29-42.
[2] Marsden, Jerrold E., and Thomas JR Hughes. Mathematical foundations of elasticity. Courier Corporation, 1994.
[3] Yano, Kentaro. The theory of Lie derivatives and its applications. Courier Dover Publications, 2020.
[4] https://gitlab.com/slcu/teamHJ/Organism
[5] Willis, Lisa, et al. "Cell size and growth regulation in the Arabidopsis thaliana apical stem cell niche." Proceedings of the National Academy of Sciences 113.51 (2016): E8238-E8246.
[6] Godin, Christophe, Christophe Golé, and Stéphane Douady. "Phyllotaxis as geometric canalization during plant development." Development 147, no. 19 (2020): dev165878.
Work Environment

The student will be hosted within the Jönsson group, led by Professor Henrik Jönsson, Director of the Sainsbury Laboratory. The group offers a collaborative and intellectually stimulating research environment, with weekly group meetings in which members present and discuss ongoing research.

The student will receive regular one-on-one supervision through weekly meetings with Dr. Amir Porat, with additional guidance and support available as needed throughout the internship.

The Sainsbury Laboratory also provides a vibrant and welcoming environment for summer students, hosting a range of social events and activities designed to foster interaction among students from different programs and research backgrounds.

Prerequisite Skills Fluids, Partial Differential Equations, Geometry / Topology
Other skills used in the Project Mathematical Physics, Numerical Analysis, Simulation
Acceptable Programming Languages Python, C++
Additional Requirements Curiosity about plants and enthusiasm for applying mathematics to living systems.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Learning the Physics of Growth: Automated Discovery of Mechanistic Rules in Flower Development

Project Title Learning the Physics of Growth: Automated Discovery of Mechanistic Rules in Flower Development
Keywords graph neural networks, scientific machine learning, inverse problems, developmental biology, biophysics
Project Listed 9 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisor Argyris Zardilis
Contact Name Argyris Zardilis
Contact Email az401@cam.ac.uk
Company/Lab/Department Sainsbury Laboratory, University of Cambridge
Address 47 Bateman Street, Cambridge, CB2 1LR
Project Duration 8 weeks full-time, late June to September but flexible during the summer.
Project Open to Masters students (Part III), Third year undergraduates (Part II), Second year undergraduates (Part IB)
Background Information Flowers are among the most morphologically intricate structures in plants. Their development begins with a simple ball of undifferentiated cells, which gradually forms into a complex organ. These developmental events are driven by a combination of chemical signals, cellular growth and division, and tissue-level mechanical forces. While we can now capture this process using high-resolution 4D imaging and spatial transcriptomics, the sheer complexity of these multi-modal datasets makes "manual" modelling, where equations are derived by hand, increasingly challenging. This creates a fundamental inverse problem: given the observed developmental trajectories of an organ, can we "work backwards" to automatically identify the underlying physical and regulatory laws?
Project Description

Our general long-term aim is to develop an ‘inference engine’ using machine learning methods that extracts interpretable mechanistic laws directly from biological data to bridge the gap between modern large biological datasets and mechanistic biophysical models. This process includes two stages: (i) learning a ‘meaningful’ representation of the data (e.g. using unsupervised learning [1, 2]) and (ii) inferring connections and ultimately physical models from this learned representation [3, 4]. We recently demonstrated an example of this approach (LMRL25, ICLR workshop [5]). Depending on the student’s interests this project can focus on either or both of these complementary strands:

Strand 1: Data Alignment. Biological data is captured across different modalities (morphology vs. gene expression). This strand focuses on the "representation" problem: using unsupervised methods (e.g. Matrix Factorisation [1] or Autoencoders [2]) to project high-dimensional data into a unified, low-dimensional "latent manifold" that represents the tissue's state.

Strand 2: Differentiable Simulations & Model Inference. This strand focuses directly on the "inference" problem. Given a representation of the tissue state and its dynamics, we will use differentiable simulations (e.g. using Graph Neural Networks [6]) combined with identifiability techniques [3, 4] to infer possible governing equations.
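As a rough illustration of the sparse-regression idea behind reference [4], the sketch below recovers a logistic growth law from a single toy trajectory using sequentially thresholded least squares. The toy system, the candidate library, the threshold and the function name `stlsq` are all assumptions chosen for the example, and it is not the group's pipeline.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    coef = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        big = ~small
        if not big.any():
            break
        # Refit using only the candidate terms that survived thresholding.
        coef[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return coef


# Toy data (hypothetical): logistic growth dx/dt = r*x - (r/K)*x**2.
r, K, x0 = 1.0, 2.0, 0.1
t = np.linspace(0.0, 8.0, 400)
x = K / (1.0 + (K / x0 - 1.0) * np.exp(-r * t))

# Numerical derivative; with real, noisy data this step needs smoothing.
dxdt = np.gradient(x, t, edge_order=2)

# Candidate library of terms: 1, x, x^2, x^3.
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
coef = stlsq(theta, dxdt)

for name, value in zip(["1", "x", "x^2", "x^3"], coef):
    print(f"{name:>4}: {value:+.3f}")
```

On this noise-free toy the surviving terms should be the linear and quadratic ones. With experimental data the result depends strongly on the threshold and on how the derivatives are estimated.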

References

[1] Argelaguet, Ricard, et al. "MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data." Genome biology 21 (2020).
[2] Yang, Karren Dai, et al. "Multi-domain translation between single-cell imaging and sequencing data using autoencoders." Nature communications 12.1 (2021).
[3] Maddu, Suryanarayana, et al. "Learning physically consistent differential equation models from data using group sparsity." Physical Review E 103.4 (2021).
[4] Brunton, Steven L., Joshua L. Proctor, and J. Nathan Kutz. "Discovering governing equations from data by sparse identification of nonlinear dynamical systems." Proceedings of the national academy of sciences 113.15 (2016).
[5] Zardilis, Argyris, Alexandra Budnikova, and Henrik Jönsson. "Learning a mechanical growth model of flower morphogenesis." LMRL25, ICLR workshop.
[6] Choi, Jeongwhan, et al. "Gread: Graph neural reaction-diffusion networks." International conference on machine learning. PMLR, 2023.
Refahi, Yassin, Zardilis, Argyris et al. "A multiscale analysis of early flower development in Arabidopsis provides an integrated view of molecular regulation and growth control." Developmental Cell 56.4 (2021).
Wang, Hanchen, et al. "Scientific discovery in the age of artificial intelligence." Nature 620.7972 (2023): 47-60.
Villoutreix, Paul. "What machine learning can do for developmental biology." Development 148.1 (2021): dev188474.

Work Environment The student will be embedded within the Jönsson group at the Sainsbury Laboratory (https://www.slcu.cam.ac.uk/research/research-group/jonsson-group), with overall supervision from the group leader (Prof Henrik Jönsson) and day-to-day supervision from Dr Argyris Zardilis (postdoc). There is the opportunity for remote work, but the student can take advantage of being part of the vibrant interdisciplinary community within the laboratory.
Prerequisite Skills Statistics, Simulation, Data Visualisation
Other skills used in the Project Programming experience, Image processing
Acceptable Programming Languages Python, No preference
Additional Requirements Willingness to work with real-world data, computational skills, and the scientific curiosity to work in a biological setting.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Reducing Skin-Tone Bias in Clinical Photoacoustic Imaging using Machine Learning and Computational Modelling

Project Title Reducing Skin-Tone Bias in Clinical Photoacoustic Imaging using Machine Learning and Computational Modelling
Keywords Medical Imaging, Machine Learning, Domain adaptation, Bias correction, Computational Modelling
Project Listed 16 January 2025
Project Status Open
Application Deadline 27 February 2026
Project Supervisors Prof. Sarah Bohndiek and Dr Amy Zheng
Contact Name Dr Amy Zheng
Contact Email yz2003@cam.ac.uk
Company/Lab/Department VISIONLab, Cavendish Laboratory
Address Cavendish Laboratory, Department of Physics, JJ Thomson Ave, Cambridge CB3 0US
Project Duration 8 weeks full-time (exact timing to be co-ordinated with a successful candidate)
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information

Photoacoustic imaging is an emerging medical imaging technique that combines light and ultrasound to non-invasively measure blood oxygenation and vascular structure, providing valuable biomarkers for early cancer detection (e.g. breast cancer). However, a key challenge for its clinical use is systematic bias caused by skin pigmentation. Melanin in the skin absorbs light strongly, particularly in individuals with darker skin tones, which reduces the amount of light reaching deeper tissue. This leads to underestimation of physiological parameters such as blood oxygenation and introduces bias that is unrelated to the true underlying biology.

In our lab, we have previously undertaken large-scale healthy volunteer studies spanning a wide range of skin tones, generating unique clinical photoacoustic datasets that clearly demonstrate pigmentation-dependent bias in quantitative measurements. Current correction methods rely on simplified optical models or empirical calibration and often fail to generalise across diverse populations. Recent advances in machine learning and computational modelling therefore offer promising new approaches to explicitly model and correct pigmentation-induced bias by combining physics-based simulations with data-driven inference.

Project Description

The project is partly determinate and partly open-ended. The overall aim is to develop quantitative methods to reduce skin-tone-dependent bias in clinical photoacoustic imaging, building on large-scale datasets previously acquired in our lab.

Possible Project Directions:

  1. Exploratory analysis of simulated and clinical data to characterise how pigmentation-related bias varies with wavelength, tissue depth, and physiological parameters.
  2. Modelling the gap between simulations and clinical images using machine learning approaches, including conditional diffusion models, adversarial image-to-image translation methods (e.g. CycleGAN), and domain adaptation techniques to learn realistic noise, artefacts, and distribution shifts.
  3. Developing physics-informed AI methods for bias correction by incorporating physical constraints or simulation-based priors from optical and acoustic models, building on existing approaches, to improve quantitative accuracy across skin tones.
  4. Gaining experience with optical and acoustic computational modelling, including Monte Carlo light transport models and acoustic wave propagation (e.g. using the in-house SIMPA framework), to understand the physical origins of bias in photoacoustic imaging.
References [1] Else TR, Loreno C, Groves A, Cox BT, Gröhl J, Modolell I, Bohndiek SE, Roshan A. The confounding effects of skin colour in photoacoustic imaging. medRxiv. 2025 Mar 30:2025-03.
[2] Gröhl, J., Dreher, K. K., Schellenberg, M., Rix, T., Holzwarth, N., Vieten, P., Ayala, L., Bohndiek, S. E., Seitel, A., & Maier-Hein, L. (2022). SIMPA: An open-source toolkit for simulation and image processing for photonics and acoustics. Journal of Biomedical Optics, 27(8), 083010.
Work Environment You will be based in the VISIONLab at the Cavendish Laboratory (https://www.bohndieklab.org/), with supervision from the group leader Prof. Sarah Bohndiek and day-to-day supervision from Dr. Amy Zheng (Postdoc). The group provides a collaborative research environment, with weekly meetings to present and discuss ongoing work. There is flexibility for remote working where appropriate, alongside in-person collaboration. The project also involves occasional travel to UCL to work closely with collaborators who are experts in ultrasound and photoacoustic imaging.
Prerequisite Skills Numerical Analysis, Image processing
Other skills used in the Project Mathematical Physics, Data Visualisation, Simulation, Machine Learning
Acceptable Programming Languages Python, MATLAB
Additional Requirements Interest in medical physics, computational modelling, and machine learning, and enthusiasm for applying quantitative methods to early cancer detection. Willingness to learn across disciplines and to work with data and models spanning both the physical and clinical aspects of the project.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Learning the Gap Between Simulated and Experimental Spectral Data

Project Title Learning the Gap Between Simulated and Experimental Spectral Data
Keywords Domain Adaptation, Spectral Analysis, Noise Modelling, Machine Learning
Project Listed 16 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisors Prof. Sarah Bohndiek and Dr Amy Zheng
Contact Name Dr Amy Zheng
Contact Email yz2003@cam.ac.uk
Company/Lab/Department VISIONLab, Cavendish Lab
Address Cavendish Laboratory, Physics Department, JJ Thomson Avenue, Cambridge, CB3 0US
Project Duration 8 weeks full-time
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information

Physics-based simulations are widely used in the development and interpretation of medical imaging and spectroscopic techniques, including Raman spectroscopy and photoacoustic imaging. Simulated data provide controlled access to underlying physical parameters and are commonly used for algorithm development and training machine-learning models. However, simulated spectra often differ systematically from experimental measurements due to instrument response, measurement noise, calibration effects, and unmodelled experimental conditions.

This simulation-to-experiment gap limits the direct transfer of algorithms trained on simulated data to real clinical or experimental settings. In spectral and signal-domain measurements, these discrepancies can manifest as wavelength-dependent distortions, changes in noise structure, baseline shifts, and amplitude scaling, all of which are not fully captured by simplified physical models.

Recent advances in machine learning and data-driven modelling offer new opportunities to explicitly learn and characterise the gap between simulated and experimental spectral data. By leveraging paired or partially paired datasets, it becomes possible to model realistic noise processes, instrument-induced distortions, and distribution shifts, improving the fidelity of synthetic data and enabling more robust downstream analysis in real-world applications.

Project Description

The project is partly determinate and partly open-ended. The overall aim is to develop quantitative methods to model the gap between simulated and experimental spectral data, using paired Raman spectroscopy data and photoacoustic datasets available in our lab. While the problem setting is well defined, the appropriate level of model complexity is intentionally left open.

Possible Project Directions:

  1. Starting with simple, interpretable models to capture systematic differences between simulated and experimental data, such as additive or multiplicative noise models, wavelength-dependent distortions, baseline shifts, or low-rank spectral transformations.
  2. Exploring more flexible data-driven models, such as probabilistic regression models, conditional diffusion models, and adversarial learning approaches (e.g. GAN-based image or spectrum translation), to capture complex noise structures, nonlinear distortions, and distribution shifts between simulated and experimental data.
  3. Comparing different classes of models to understand trade-offs between interpretability, physical plausibility, and predictive performance.
  4. (Optional) Incorporating physical knowledge through constraints or simulation-based priors, ensuring that learned mappings remain consistent with known physical behaviour.
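A minimal sketch of the simplest kind of model mentioned in direction 1: the experimental spectrum is treated as a scaled copy of the simulated one plus a linear baseline and additive noise, and the three parameters are fitted by linear least squares. The synthetic spectrum, parameter values and variable names are all hypothetical, and real Raman or photoacoustic data would likely need a richer baseline and noise model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical normalised spectral axis and a single-peak simulated spectrum.
axis = np.linspace(0.0, 1.0, 200)
simulated = np.exp(-((axis - 0.4) / 0.05) ** 2)

# Synthetic "experimental" spectrum: gain, linear baseline, Gaussian noise.
true_gain, true_offset, true_slope = 0.8, 0.10, 0.30
experimental = (
    true_gain * simulated
    + true_offset
    + true_slope * axis
    + rng.normal(0.0, 0.01, axis.size)
)

# Linear model: experimental ~ gain * simulated + offset + slope * axis.
design = np.column_stack([simulated, np.ones_like(axis), axis])
params, *_ = np.linalg.lstsq(design, experimental, rcond=None)
gain, offset, slope = params

# Residuals show what the simple model fails to explain.
residual = experimental - design @ params
print(f"gain={gain:.3f} offset={offset:.3f} slope={slope:.3f}")
print(f"residual std={residual.std():.4f}")
```

Structure left in the residuals, such as wavelength-dependent patterns, is one indication that a more flexible model from direction 2 may be warranted.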
References [1] Else TR, Loreno C, Groves A, Cox BT, Gröhl J, Modolell I, Bohndiek SE, Roshan A. The confounding effects of skin colour in photoacoustic imaging. medRxiv. 2025 Mar 30:2025-03.
[2] Gröhl, J., Dreher, K. K., Schellenberg, M., Rix, T., Holzwarth, N., Vieten, P., Ayala, L., Bohndiek, S. E., Seitel, A., & Maier-Hein, L. (2022). SIMPA: An open-source toolkit for simulation and image processing for photonics and acoustics. Journal of Biomedical Optics, 27(8), 083010.
[3] Kamp, M., Surmacki, J., Segarra Mondejar, M. et al. Raman micro-spectroscopy reveals the spatial distribution of fumarate in cells and tissues. Nat Commun 15, 5386 (2024).
Work Environment You will be based in the VISIONLab at the Cavendish Laboratory, with supervision from the group leader Prof. Sarah Bohndiek and day-to-day supervision from Dr Amy Zheng (Postdoc). The group provides a collaborative research environment, with weekly meetings to present and discuss ongoing work. There is flexibility for remote working where appropriate, alongside regular in-person collaboration. The project will involve interaction with researchers across the Physics Department, as well as occasional visits to UCL to work with experts in ultrasound and photoacoustic imaging.
Prerequisite Skills Numerical Analysis, Algebra / Number theory
Other skills used in the Project Mathematical Physics, Image processing, Predictive Modelling
Acceptable Programming Languages Python, MATLAB
Additional Requirements Interest in computational modelling and machine learning, particularly in areas such as noise modelling and domain adaptation. Willingness to learn across disciplines. Enthusiasm for working with models and real experimental datasets.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Can we use dynamical systems to predict what has happened in the past?

Project Title Can we use dynamical systems to predict what has happened in the past?
Keywords Mechanics, Dynamical systems, Biology, growth
Project Listed 16 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisor Euan Smithers
Contact Name Euan Smithers
Contact Email euan.smithers@slcu.cam.ac.uk
Company/Lab/Department Sainsbury Laboratory
Address Sainsbury Laboratory, Bateman Street, Cambridge CB2 1LR
Project Duration 8 weeks, full time
Project Open to Masters students (Part III), Third year undergraduates (Part II), Second year undergraduates (Part IB)
Background Information Plant tissues are built of rigidly connected cells, joined together by edges and, importantly, junctions. The angle distribution of the edges meeting at a junction has been shown to change over time due to growth and junction tension. The steady state of the junction angle distribution, however, could be affected by the growth direction. Currently, experimentalists have to perform time-consuming imaging at multiple time points and then link these images to determine the growth direction. Using a dynamical system of junction angle and edge tension, we wish to test whether the past growth direction can be predicted from a single image at the last time point.
Project Description The student will be expected to apply knowledge of dynamical systems to real data, combined with image and statistical analysis, to test the method's validity. For this project, you will have access to actual data and experience working directly with experimentalists in an interdisciplinary environment.
References https://www.slcu.cam.ac.uk/research/robinson-group
Work Environment The student will work with the Robinson lab group as a team and will be primarily supervised by a post-doc, available to talk and provide support at any time. They will also have weekly meetings with the group leader. There are no strict hours, but the post-doc supervisor will be available during regular work hours. The student will get a desk and a computer at the Sainsbury laboratory so they can do the work. The Sainsbury Laboratory is a great work environment, with different sports groups and organised social events, including ones for just the summer students.
Prerequisite Skills Statistics, Mathematical Physics, Image processing
Other skills used in the Project Image processing, Mathematical Physics
Acceptable Programming Languages Python, MATLAB
Additional Requirements We seek enthusiastic students who are keen to apply their mathematical skills to real-life biological problems and to join us in this interdisciplinary environment. No prior knowledge of biology is required, but experience in coding will be looked upon favourably.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Research in the Goldman group (EMBL-EBI): mathematical and computational methods to analyse genetic data

Project Title Research in the Goldman group (EMBL-EBI): mathematical and computational methods to analyse genetic data.
Keywords Genome data analysis, phylogenetics, algebraic geometry, sequencing technologies.
Project Listed 20 January 2026
Project Status Open
Application Deadline 27 February 2026
Project Supervisors Nick Goldman, Samuel Martin, Nicola de Maio, Isabel Poetzsch 
Contact Name Nick Goldman
Contact Email goldman@ebi.ac.uk
Company/Lab/Department EMBL-EBI, Goldman group
Address Wellcome Genome Campus, Hinxton, Cambridgeshire
Project Duration We are flexible to different needs.
Project Open to Masters students (Part III), Third year undergraduates (Part II), Second year undergraduates (Part IB)
Background Information The European Bioinformatics Institute (EMBL-EBI) is a world leading research and data science institute focusing on biological and biomedical sciences (https://www.ebi.ac.uk/about). The broad focus of the Goldman group at EMBL-EBI is the development of mathematical and computational approaches to analyse genetic data, and we can offer a number of projects to best fit the interests and skills of the students.
Project Description

One of our more specific areas of research is phylogenetic networks. These are directed graphs describing complex evolutionary histories. We use Markov models to probabilistically evaluate genome evolution histories along phylogenetic networks. We use algebraic statistics to understand these models, which are viewed as varieties from algebraic geometry, allowing us to efficiently infer phylogenetic networks from DNA data. Possible projects in this area could be to extend existing work [1] on inferring small phylogenetic networks, or developing further understanding of the geometry of small phylogenetic network models [2]. Experience in Python programming and knowledge of algebraic geometry and statistical inference are desirable.
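To give a flavour of the Markov models involved, the sketch below builds the transition matrix of the Jukes-Cantor model, the simplest group-based nucleotide substitution model, and checks the Markov (Chapman-Kolmogorov) property along a branch. The listing does not say which substitution models the group uses, so the choice of Jukes-Cantor and the function name are assumptions for illustration.

```python
import numpy as np


def jc69_transition_matrix(t):
    """4x4 Jukes-Cantor matrix for branch length t (expected substitutions per site)."""
    decay = np.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay  # probability the nucleotide is unchanged
    diff = 0.25 - 0.25 * decay  # probability of each specific change
    return np.full((4, 4), diff) + (same - diff) * np.eye(4)


P_short = jc69_transition_matrix(0.1)
P_long = jc69_transition_matrix(0.2)
P_total = jc69_transition_matrix(0.3)

# Each row is a probability distribution over A, C, G, T.
print(P_short.sum(axis=1))

# Composing two branch segments equals one branch of the summed length.
print(np.allclose(P_short @ P_long, P_total))
```

On a tree or network, the likelihood of observed sequences is assembled from such matrices, one per edge.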

Another area of the group’s research is massive-scale phylogenetics. Progress in sequencing technologies enables the generation of datasets of thousands, and sometimes millions, of pathogen genomes. This data can reveal essential details of infectious disease evolution and spread, but existing mathematical and computational methods for analysing such data are not scalable enough. We address this challenge by developing new scalable and accurate approaches [3,4]. Internships on this subject would focus on phylogeography, which uses genetic data to reconstruct transmissions between countries, species, or generally groups of hosts. Possible projects include contributing to the development of simulation methods (good coding skills in Python or C++ would be beneficial), benchmarking of existing methods (some coding skills would be useful), or the development of new efficient mathematical pathogen spread models robust to biases in sample collection (probabilistic mathematical modeling skills and some coding experience would be desirable).

Lastly, our group develops mathematical models and algorithms for the improvement of sequencing technologies - machines that read DNA data from biological samples. In particular, we use information theory principles to improve adaptive sampling methods, which focus sequencing resources on reading DNA fragments of interest: see [5]. This makes sequencing more efficient, allowing researchers to achieve more accurate genetic data at lower costs. Possible projects in this area would focus on simulations and benchmarking, determining scenarios in which adaptive sampling is useful. Skills in probabilistic mathematical modeling and some coding experience would be desirable.

References [1] Martin, S., Holtgrefe, N., Moulton, V., Leggett, R.M. Algebraic Invariants for Inferring 4-leaf Semi-directed Phylogenetic networks. Systematic Biology (2025) https://doi.org/10.1093/sysbio/syaf071.
[2] Gross, E., Krone, R. & Martin, S. Dimensions of Level-1 Group-Based Phylogenetic Networks. Bull Math Biol (2024) https://doi.org/10.1007/s11538-024-01314-z.
[3] De Maio, N., Kalaghatgi, P., Turakhia, Y. et al. Maximum Likelihood Pandemic-scale Phylogenetics. Nature Genetics (2023) https://doi.org/10.1038/s41588-023-01368-0.
[4] De Maio, N., Ly-Trong, N., Martin, S. et al. Assessing Phylogenetic Confidence at Pandemic Scales. Nature (2025) https://www.nature.com/articles/s41586-025-09567-x.
[5] Weilguny, L., De Maio, N., Munro, R., et al. Dynamic, Adaptive Sampling During Nanopore Sequencing Using Bayesian Experimental Design. Nature Biotechnology (2023) https://www.nature.com/articles/s41587-022-01580-z.
Work Environment The students will work directly with a member of the group (scientist, postdoc or PhD student) who will supervise them and collaborate with them on a daily basis (or less if preferred by the student). They will also have weekly meetings with the group leader and with the rest of the group to discuss updates and ongoing issues. We are flexible regarding working hours. We expect at least 3 days per week working from the office, but this can be adjusted to meet individual circumstances.
Prerequisite Skills Probability / Markov Chains
Other skills used in the Project Statistics, Probability / Markov Chains, Geometry / Topology, Algebra / Number theory, Simulation, Predictive Modelling, Data Visualisation, App Building
Acceptable Programming Languages Python, C++
Additional Requirements Experience with HPCs might be useful in some projects in our group.
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

 

Quantitative Analysis of Host–Transmissible Cancer Interactions Using Genomic Data in Tasmanian Devils

Project Title Quantitative Analysis of Host–Transmissible Cancer Interactions Using Genomic Data in Tasmanian Devils
Keywords biology, bioinformatics, genomics, transmissible cancers, evolutionary dynamics
Project Listed 20 January 2025
Project Status Open
Application Deadline 27 February 2026
Project Supervisors Elizabeth Murchison and Sophia Belkhir
Contact Name Elizabeth Murchison
Contact Email epm27@cam.ac.uk
Company/Lab/Department Veterinary Medicine
Address Madingley Road, CB3 0ES
Project Duration 8 weeks, full time
Project Open to Masters students (Part III), Third year undergraduates (Part II)
Background Information

Most cancers die when the individual carrying them dies. In very rare cases, however, cancer cells can behave like parasites: they pass directly from one individual to another and continue living in new hosts. These are known as transmissible cancers.

Tasmanian devils, carnivorous marsupials living on the island of Tasmania, are affected by two transmissible cancers. These cancers spread when devils bite each other, transferring living cancer cells that grow into facial tumours. One of these cancers has already spread across almost the entire population and is threatening the survival of the species.

A key question is why these tumours are generally not recognized by the devils' immune system, and whether some devils are able to slow down or resist the disease.

Project Description

Tasmanian devils carry certain genes that produce proteins displayed on the surface of their cells. These proteins differ slightly between individuals and help the body distinguish its own cells from foreign cells. Recent work (Batley et al. 2025, bioRxiv) suggests that devils whose versions of these genes differ more strongly from those carried by the cancer cells are more likely to mount an immune response to the tumours, and overall have better outcomes. We aim to dig further into that question using a larger dataset, and for that we will:

1. Characterize genetic variation
We will analyze DNA sequence data from hundreds of Tasmanian devils, focusing on a small group of highly variable immune genes shared between hosts and cancer cells. This will involve bioinformatics analysis of whole-genome short-read DNA sequencing data, complemented by long-read PacBio data available for a subset of individuals. We will identify genetic variants, including single nucleotide and structural variants, and link these results with gene expression data.
The goal is to reliably infer each individual’s gene variants from ambiguous sequencing data, using partial high-quality information. This will involve testing existing computational methods—or developing our own—to improve genotyping accuracy across the short-read dataset.

2. Link genes to disease outcomes
Using gene expression data, we will use statistical modelling to test whether devils carrying certain gene variants tend to:

  • show more tumour regression,
  • survive longer,
  • have more immune cells infiltrating the tumour.

3. Study cancer adaptation
If having such genes is helpful for the devils' survival, we might expect to see the frequency of those variants increase in the population. We will test for that. On the other hand, if certain gene variants in devils make it harder for the cancer to survive, this creates an evolutionary arms race: as the host population evolves traits that improve resistance, the cancer cells may in turn evolve ways to avoid detection. We will examine cancer DNA and gene expression data to see whether some cancer lineages have lost or reduced the production of these surface proteins and, similarly, how frequent that is.

Overall, this project offers hands-on experience with genomic and transcriptomic datasets, method development, and quantitative analysis in a setting where clear hypotheses will be tested against data, which directly connects to questions in cancer biology, epidemiology, and species conservation.

References Stammnitz et al. 2023. The evolution of two transmissible cancers in Tasmanian devils. Science 380, 283–293. https://doi.org/10.1126/science.abq6453
Caldwell, A., Siddle, H.V., 2017. The role of MHC genes in contagious cancer: the story of Tasmanian devils. Immunogenetics 69, 537–545. https://doi.org/10.1007/s00251-017-0991-9
Batley, K.C. et al. 2025. Immune recognition of transmissible cancers in Tasmanian devils with MHC-I deletion. https://doi.org/10.1101/2025.03.20.644438
Work Environment The student will be part of the Transmissible Cancer Group, a dynamic research lab at the Department of Veterinary Medicine. They will work with the direct support of a 3rd year PhD student (Sophia Belkhir), who will provide day-to-day advice and supervision, and will have weekly progress meetings with her and the PI of the group. The group mostly works in the office from 9:30am until 5:30pm, with some flexibility possible and occasional work from home permitted. We prefer being in-person for socialisation and better support, so that the student can also count on the rest of the group (3 postdocs, 2 other PhD students) for specific questions if needed. They will have the opportunity to attend internal seminars (on infectious disease modelling) and are expected to participate in the weekly group meetings on Thursday afternoons. They will have access to the Sanger Institute High Performance Computing system, and the resources and training that go with it.
Prerequisite Skills Statistics
Other skills used in the Project Statistics, Bioinformatics and genomics, Data Visualisation
Acceptable Programming Languages Python, R, Bash
Additional Requirements Enthusiastic and willing to learn about the biological aspects of transmissible cancers
Application Instructions Send your CV to the contact provided above along with a covering email which explains why you are interested in the project and why you think you would be a good fit.