2025 CMP Academic Projects

Academic CMP project proposals from summer 2025

Sainsbury Laboratory, University of Cambridge - Modelling Cellular Kinematics in Self-Similar Plant Growth
Keywords: Plant growth, morphogenesis, differential geometry, self-similarity, numerical modelling
Sainsbury Laboratory, University of Cambridge - Modelling the unknowns! How do plant cells develop, grow and communicate?
Keywords: Interdisciplinary, Plants, Scientific research, Biology, Mathematical modelling
Sainsbury Laboratory, University of Cambridge - Unsupervised learning for data integration and hypotheses generation in flower development
Keywords: unsupervised learning, machine learning, developmental biology, multi-modal data
VisionLab, Department of Physics - Supervised machine learning in digital pathology applied to the detection of disease in oesophageal lesions
Keywords: Image processing, supervised learning, textural analysis
EMBL-EBI / Goldman Group - Research in the Goldman group (EMBL-EBI): pandemic-scale phylogenetics and phylogenetic networks.
Keywords: Molecular evolution, phylogenetics, Markov models
Sainsbury Laboratory, University of Cambridge - Parameter inference from time-lapse images
Keywords: modelling, parameter inference, image, gradient descent
MRC-LMB & University of Cambridge - Virtual labelling for label free microscopy
Keywords: microscopy, biology, image processing, machine learning
MRC Cognition and Brain Sciences Unit - Shaping electrical stimulation in hearing implants
Keywords: Hearing, neuro-prosthetics, speech enhancement, neural decoding.
Institute of Astronomy - Nested Sampling for ARIMA Model Selection in Astronomical Time Series
Keywords: ARIMA, Nested Sampling, Time Series Analysis, Bayesian Model Selection, Astronomical Data
Institute of Astronomy - Machine Learning Enhanced Cosmological Tension Detection
Keywords: Cosmological tensions, Machine learning, Normalizing flows, Bayesian inference, Dark energy
Judge Business School, University of Cambridge - Real-Time Air Quality Forecasting for Proactive Policy Interventions
Keywords: Kalman filter, stochastic trend, Gompertz curve, PM2.5, Air Quality Index, Weather
Plus (plus.maths.org), Millennium Mathematics Project, DAMTP - Communicating mathematics - uncovering the story behind new research
Keywords: Communication, public engagement, mathematics, statistics, data science
MRC Cognition and Brain Sciences Unit - Tracking changes in human sensory performance over time
Keywords: Human behaviour, Perception, Random processes, Estimation theory, Optimization
Department of Physics - Understanding skin tone bias in photoacoustics
Keywords: photoacoustics, ultrasound, inverse problems, optimisation, reconstruction

Modelling Cellular Kinematics in Self-Similar Plant Growth

Project Title	Modelling Cellular Kinematics in Self-Similar Plant Growth
Keywords	Plant growth, morphogenesis, differential geometry, self-similarity, numerical modelling
Project Listed	8 January 2025
Project Status	Filled
Contact Name	Amir Porat
Contact Email	amir.porat@slcu.cam.ac.uk
Company/Lab/Department	Sainsbury Laboratory, University of Cambridge
Address	Sainsbury Laboratory, University of Cambridge, 47 Bateman Street, Cambridge, CB2 1LR
Project Duration	8 weeks, full-time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Plant morphogenesis arises from a complex interplay of growth, mechanics, and genetic regulation. A central challenge in plant development is understanding how biochemical morphogens shape growth patterns. This project addresses this question by focusing on plant organs that exhibit self-similar growth. These organs maintain stable geometries as their constituent cells grow, divide, and flow within the structure, offering an analytically tractable system for exploring growth kinematics.
Project Description	This project aims to model cellular proliferation—including growth and division—within self-similar growing organs. To achieve this, we will develop models using a custom numerical solver for plant morphodynamics, implemented in C++ [1]. The approach relies on a continuous framework of self-similar growing manifolds derived from differential geometry [2], which will be discretized into cellular units informed and validated by live imaging data. The study will begin with modelling axial growth in simple geometries, such as the rod-like structures of straight roots and shoots (in 1D and 3D). We will then focus on the apical hook, a rod-like organ which maintains a distinct macroscopic curvature [3,4]. If time permits, a similar approach will be applied to other organs, including the dome of the shoot apical meristem of Arabidopsis thaliana (3D) [5], and the notch of Marchantia (2D) [6]. By integrating these models into the custom solver, the project aims to unravel how morphogen patterning directly influences growth dynamics, offering predictive insights into plant morphogenesis. This involves solving reaction-diffusion equations on evolving cellular 3D geometries and coupling morphogen transport with the organ's elastic state.
Work Environment	The student will work within the Jönsson group, led by Professor Henrik Jönsson, Director of the Sainsbury Laboratory. The group provides a collaborative and dynamic environment, with weekly meetings where members discuss and present their research. Regular one-on-one supervision will be provided, with additional support available as needed.
References	[1] https://gitlab.com/slcu/teamHJ/Organism [2] A. Goriely, The Mathematics and Mechanics of Biological Growth. Springer, 2017. [3] K. Jonsson, Y. Ma, A.-L. Routier-Kierzkowska, and R. P. Bhalerao, “Multiple mechanisms behind plant bending,” Nature Plants, vol. 9, no. 1, pp. 13–21, 2023. [4] Walia, Ankit, et al. "Differential Growth is an Emergent Property of Mechanochemical Feedback Mechanisms in Curved Plant Organs." Available at SSRN 4677553. [5] Willis, Lisa, et al. "Cell size and growth regulation in the Arabidopsis thaliana apical stem cell niche." Proceedings of the National Academy of Sciences 113.51 (2016): E8238-E8246. [6] Bonfanti, Alessandra, et al. "Stiffness transitions in new walls post-cell division differ between Marchantia polymorpha gemmae and Arabidopsis thaliana leaves." Proceedings of the National Academy of Sciences 120.41 (2023): e2302985120.
Prerequisite Skills	Fluids, Geometry/Topology
Other Skills Used in the Project	Numerical Analysis, PDEs, Simulation, Predictive Modelling
Acceptable Programming Languages	Python, C++
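Illustrative sketch	The project description mentions solving reaction-diffusion equations on cellular geometries. As a much-reduced illustration, the Python sketch below integrates dc/dt = D d²c/dx² - k c for a single morphogen on a static 1D row of cells, with a source in the first cell and no-flux ends. Everything here is an assumption made for illustration: the function names, the parameter values, the explicit Euler scheme, and the absence of growth and mechanics. It does not use or represent the Organism solver [1].

```python
import numpy as np

def step_morphogen(c, D, k, dx, dt, source=0.0):
    """One explicit Euler step of dc/dt = D * d2c/dx2 - k * c on a 1D row of cells.

    Hypothetical helper: no-flux ends, production `source` added to cell 0 only.
    """
    # Repeating the edge values as ghost cells gives zero flux through both ends.
    padded = np.pad(c, 1, mode="edge")
    laplacian = (padded[2:] - 2.0 * c + padded[:-2]) / dx**2
    c_new = c + dt * (D * laplacian - k * c)
    c_new[0] += dt * source
    return c_new

def simulate(n_cells=50, D=1.0, k=0.1, dx=1.0, source=1.0, t_end=200.0):
    """Integrate from an empty tissue until the profile is close to steady state."""
    dt = 0.2 * dx**2 / D  # kept below the explicit stability limit dx^2 / (2 D)
    c = np.zeros(n_cells)
    for _ in range(int(t_end / dt)):
        c = step_morphogen(c, D, k, dx, dt, source)
    return c

if __name__ == "__main__":
    profile = simulate()
    # The gradient decays away from the source over a length of about sqrt(D / k) cells.
    print(np.round(profile[:8], 3))
```

With k = 0 and no source the scheme conserves the total amount of morphogen, which is a convenient sanity check before adding growth or coupling to mechanics.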

Modelling the unknowns! How do plant cells develop, grow and communicate?

Project Title	Modelling the unknowns! How do plant cells develop, grow and communicate?
Keywords	Interdisciplinary, Plants, Scientific research, Biology, Mathematical modelling
Project Listed	13 January 2025
Project Status	Filled
Contact Name	Euan Smithers
Contact Email	euan.smithers@slcu.cam.ac.uk
Company/Lab/Department	Sainsbury Laboratory, University of Cambridge
Address	Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge, CB2 1LR
Project Duration	8 weeks between late June and September.
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Plants are fundamental to our world, and research into their mechanisms can allow us to develop more resilient crops, an essential need for the problems the future will bring. Plants also offer many exciting mysteries ripe for mathematical modelling and insight. Our goal is to understand how mechanics, cell division and chemical signalling interact to allow plants to develop and grow. Plant cells are rigidly connected, so where they decide to divide and grow needs to be heavily coordinated to determine their overall growth and tissue shape, which is of vital importance for plant functionality and efficiency. To investigate plant development, we apply experimental and modelling approaches.
Project Description	Join us for a project and help develop tools to understand plant development. There is flexibility in what you choose to do, but we apply a mix of image analysis, mechanical modelling and reaction-diffusion modelling. For this project, you will have access to real data and gain experience working directly with experimentalists in an interdisciplinary environment. Possible projects include the consequences of cell and tissue topology and geometry for plant tissues, the effect of different cell arrangements and networks on cell-cell communication, and how mechanical stress affects plant tissues and how plants sense mechanics.
Work Environment	The student will work with the Robinson lab group as a team and will be primarily supervised by a post-doc, available to talk and provide support at any time. They will also have weekly meetings with the group leader. There are no strict hours, but the post-doc supervisor will be available during regular work hours. The student will get a desk and a computer at the Sainsbury Laboratory so they can do the work. The Sainsbury Laboratory is a great work environment, with different sports groups and organised social events, including ones for just the summer students.
References	https://www.slcu.cam.ac.uk/research/robinson-group
Prerequisite Skills	Mathematical physics, PDEs, Simulation
Other Skills Used in the Project	Mathematical physics, Simulation, Predictive Modelling
Acceptable Programming Languages	Python, MATLAB, C++

Unsupervised learning for data integration and hypotheses generation in flower development

Project Title	Unsupervised learning for data integration and hypotheses generation in flower development
Keywords	unsupervised learning, machine learning, developmental biology, multi-modal data
Project Listed	13 January 2025
Project Status	Filled
Contact Name	Argyris Zardilis
Contact Email	argyris.zardilis@slcu.cam.ac.uk
Company/Lab/Department	Sainsbury Laboratory, University of Cambridge
Address	Sainsbury Laboratory, University of Cambridge, Bateman Street, Cambridge, CB2 1LR
Project Duration	8 weeks, late June to September but flexible during the summer.
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Flowers are among the most morphologically intricate structures in plants. Their development begins with a simple ball of undifferentiated cells, which gradually forms into a complex organ. One of the earliest visible stages in this process is the emergence of sepals, which fold over at the poles of the flower bud. These developmental events are driven by a combination of chemical signals, cellular growth and division, and tissue-level mechanical forces. In the group, we aim to understand how these different factors (chemical, mechanical, and morphological) work together to shape the developing flower. The data we collect during this process is multi-scale and multi-modal, including both molecular data (such as gene and protein expression profiles) and time-series data capturing changes in tissue morphology. As these datasets are large and diverse, traditional methods for hypotheses generation become challenging. Computational methods can be used, but not out of the box, and as a community we have not yet converged on methods for datasets of this type. This hinders the extraction of understanding, as our analysis methods are lagging behind our data generation capabilities.
Project Description	In this project, the student will work on the integration of the diverse datasets capturing developmental events during the early stages of flower formation. Such analyses are extremely challenging to perform manually due to the size and complexity of the datasets so the primary goal would be to use methods from unsupervised learning to automatically uncover relationships between the different biological signals in this process. The project will explore two powerful unsupervised learning approaches: - Factor Analysis: A statistical method to find common underlying patterns in multi-modal data across time [1]. - (Variational) Autoencoders: A neural network technique that learns compact, meaningful representations of data [2]. Depending on the results, we may extend the work to include physical models, using either traditional approaches or advanced learning techniques, such as directly discovering equations that describe the relationships between signals [3, 4].
Work Environment	The student will be embedded within the Jönsson group at the Sainsbury Laboratory (https://www.slcu.cam.ac.uk/research/research-group/jonsson-group), with overall supervision from the group leader (Prof Henrik Jönsson) and day-to-day supervision from Dr Argyris Zardilis (postdoc). There is the opportunity for remote work, but the student can take advantage of being part of the vibrant interdisciplinary community within the laboratory.
References	[1] Argelaguet, Ricard, et al. "MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data." Genome biology 21 (2020): 1-17. [2] Yang, Karren Dai, et al. "Multi-domain translation between single-cell imaging and sequencing data using autoencoders." Nature communications 12.1 (2021): 31. [3] Rudy, Samuel H., et al. "Data-driven discovery of partial differential equations." Science advances 3.4 (2017): e1602614. [4] Maddu, Suryanarayana, et al. "Learning physically consistent differential equation models from data using group sparsity." Physical Review E 103.4 (2021): 042310. Refahi, Yassin, et al. "A multiscale analysis of early flower development in Arabidopsis provides an integrated view of molecular regulation and growth control." Developmental Cell 56.4 (2021): 540-556. Wang, Hanchen, et al. "Scientific discovery in the age of artificial intelligence." Nature 620.7972 (2023): 47-60. Villoutreix, Paul. "What machine learning can do for developmental biology." Development 148.1 (2021): dev188474. Hallou, Adrien, et al. "A computational pipeline for spatial mechano-transcriptomics." bioRxiv (2023): 2023-08.
Prerequisite Skills	Statistics, Data Visualization
Other Skills Used in the Project	Image processing, Simulation, Predictive Modelling
Acceptable Programming Languages	Python, No Preference

Supervised machine learning in digital pathology applied to the detection of disease in oesophageal lesions

Project Title	Supervised machine learning in digital pathology applied to the detection of disease in oesophageal lesions
Keywords	Image processing, supervised learning, textural analysis
Project Listed	20 January 2025
Project Status	Filled
Contact Name	Prof. Sarah Bohndiek
Contact Email	seb53@cam.ac.uk
Company/Lab/Department	VisionLab, Department of Physics
Address	Ray Dolby Centre, 19 J J Thomson Avenue, Cambridge CB3 0HE
Project Duration	Any 8-10 week period during the summer
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Histopathologists play a vital part in medicine by examining microscopic images of biopsy tissue and returning a diagnosis. Rapid, accurate diagnoses are critical to effective treatment of cancer patients. However, health resources are currently stretched, creating a reporting bottleneck. Histopathologists identify disease by visual inspection of the cellular structure in the image. Since the tissue images are 2D numerical arrays of pixel values, where higher values represent increasing brightness, computational analysis of spatial patterns within the images could provide important metrics that help accelerate and substantiate disease diagnoses. This transformative new field of study, known as Digital Pathology (DP), offers a pathway to pattern recognition beyond the capability of human visual inspection, impacting the speed, accuracy and data accessibility of diagnostic practice. The VisionLab team has a unique opportunity to explore the power of DP applied to early detection of oesophageal cancer, one of the deadliest cancers worldwide. Early detection of oesophageal cellular change is key to improving long term patient survival rates. We have built a rich DP data set of oesophageal tissue sections containing pre-malignant lesions. Each section image has regions of interest (ROI) labelled with a disease classification by a specialist pathologist. VisionLab is testing innovative optical and computational techniques against this data set to validate the diagnostic accuracy of those techniques.
Project Description	The project aims are: (i) to quantify the spatial distribution of cellular-scale features in our DP section images to identify differences between the various stages of cancer evolution; and (ii) to identify a subset of feature metrics that most effectively discriminates between the disease labels. Your project will be to develop the computational pipeline that achieves these aims. You will familiarise yourself with the image data files and software tools available. You will review the academic literature and develop a rationale for the computational methods to apply. You will segment the images to select those ROIs that are relevant to your research. You will spatially analyse image files to perform feature extraction and develop a model that identifies which subset of features most robustly discriminates between the ROI classes. You will divide your data into representative training, validation and test sets and evaluate your model performance. You will present your findings to the other members of the lab at the end of your project. You will write a project report that fulfils your course requirement and make your methods available to other VisionLab members.
Work Environment	You will be based at the state-of-the-art Ray Dolby Centre, opened this year on the West Cambridge site. Work will be supervised by a PhD student but you will have the independence to tailor your approach. Working remotely is also possible, although you will need access to advanced computer hardware to efficiently process the large volumes of data.
References	https://www.mdpi.com/2072-6694/11/12/1937 https://www.mdpi.com/2072-6694/12/3/578 https://doi.org/10.1117/1.JMI.3.4.047502 https://www.nature.com/articles/s41598-019-54139-5
Prerequisite Skills
Other Skills Used in the Project	Image processing
Acceptable Programming Languages	Python, MATLAB

Research in the Goldman group (EMBL-EBI): pandemic-scale phylogenetics and phylogenetic networks.

Project Title	Research in the Goldman group (EMBL-EBI): pandemic-scale phylogenetics and phylogenetic networks.
Keywords	Molecular evolution, phylogenetics, Markov models
Project Listed	24 January 2025
Project Status	Open
Contact Name	Nicola de Maio, Samuel Martin
Contact Email	demaio@ebi.ac.uk; samuel.martin@ebi.ac.uk
Company/Lab/Department	EMBL-EBI / Goldman Group
Address	European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD United Kingdom
Project Duration	6-8 weeks, full-time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	The European Bioinformatics Institute (EMBL-EBI) is a world leading research and data science institute focusing on biological and biomedical sciences (https://www.ebi.ac.uk/about). The broad focus of the Goldman group at EMBL-EBI is molecular phylogenetics - the study of evolution through the analysis of molecular sequence data such as DNA. We develop mathematical models and computational tools to reconstruct evolutionary histories from sequence data. Typically, this history is described in terms of a phylogenetic tree — a tree graph in which the leaves of the tree represent extant species, and the internal vertices represent common ancestor species.
Project Description	The Goldman group can offer a number of projects to best fit the interests and skills of the students. One of the main areas of the group’s research is pandemic-scale phylogenetics. During the COVID-19 pandemic, low-cost DNA sequencing enabled the widespread sequencing of the SARS-CoV-2 genome, and we now have access to millions of these genomes. The challenge is to develop computational tools capable of analysing this large dataset in order to understand the virus’s evolutionary history. The group has developed the software MAPLE [1] for this purpose, which we continue to improve and expand. Possible projects include the development and testing of new features, such as leveraging meta-data (e.g. time and location of sequencing) to improve the accuracy of phylogenetic tree reconstruction, or identifying and correcting errors in genome data. Ability to code, particularly in Python, is required for this project, and experience in large data analysis will be considered a plus. Another area of research is phylogenetic networks: graphs that describe the evolution of species, like phylogenetic trees, but that can also describe “horizontal” evolution events such as hybridisation. Here, we are interested in learning about Markov models of genome evolution placed on phylogenetic networks. One approach we are using comes from algebraic statistics, where models are viewed as varieties from algebraic geometry. We are interested in using this approach to develop fast methods of inferring phylogenetic networks from DNA sequence data. One possible project in this area would be to extend existing work [2] on inferring small phylogenetic networks to include parameter estimation, or to develop further understanding of the geometry of small phylogenetic network models [3]. Experience in Python programming and knowledge of computational algebraic geometry (e.g. affine/projective varieties, Gröbner bases) is desirable.
Work Environment	The student will be embedded in the Goldman group at EBI. They will be assigned a project partner (either Nicola de Maio or Samuel Martin, depending on the project), and have dedicated desk space in an office shared with other members of the group. As well as the group leader Nick Goldman, there are currently two PhD students, two postdocs, and one senior scientist in the group. Office hours are the usual 9-5. Some remote working may be possible.
References	[1] De Maio, N., Kalaghatgi, P., Turakhia, Y. et al. Maximum likelihood pandemic-scale phylogenetics. Nat Genet 55, 746–752 (2023). https://doi.org/10.1038/s41588-023-01368-0. [2] Martin, S., Moulton, V., Leggett, R.M. Algebraic Invariants for Inferring 4-leaf Semi-directed Phylogenetic networks. bioRxiv 2023.09.11.557152 (2023). doi: https://doi.org/10.1101/2023.09.11.557152. [3] Gross, E., Krone, R. & Martin, S. Dimensions of Level-1 Group-Based Phylogenetic Networks. Bull Math Biol 86, 90 (2024). https://doi.org/10.1007/s11538-024-01314-z.
Prerequisite Skills	Statistics, Probability/Markov Chains, Simulation, programming, command line usage
Other Skills Used in the Project
Acceptable Programming Languages	Python, C++

Parameter inference from time-lapse images

Project Title	Parameter inference from time-lapse images
Keywords	modelling, parameter inference, image, gradient descent
Project Listed	24 January 2025
Project Status	Filled
Contact Name	Elise Laruelle
Contact Email	elise.laruelle@slcu.cam.ac.uk
Company/Lab/Department	Sainsbury Laboratory, University of Cambridge
Address	47 Bateman Street, Cambridge, CB2 1LR
Project Duration	8 weeks full time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	During their lifetime, cells follow a programme of growth and division called the cell cycle. Each cell can follow a different programme depending on parameters such as its function or its environment. In plants, we suspect that ambient temperature is one of these parameters, as cell growth and division are perturbed in the leaves when plants grow at a higher temperature. Understanding how cells modulate this cycle is challenging because of the time the process can take and the variability within the studied tissue. To track the phases of the cell cycle and collect enough values for statistical significance, many time points are needed. Moreover, in plant tissue, the light exposure during imaging adds a stress that could perturb the recorded process. When extracting the duration of the different cell cycle phases, researchers currently make compromises that could be avoided by using modelling methods to infer these durations. This new analysis will make it possible to obtain results faster, to improve their statistical significance and to extend the analysis. It will help us understand how plants respond to changing temperature.
Project Description	The current method for measuring the length of the cell cycle phases requires tracking cells for several hours to obtain only a few observed lengths per experiment. The aim of this project is to infer biological parameters from the microscopy images produced by these experiments. The student will be in charge of improving a parameter estimation method based on a Markov chain model with a gradient descent algorithm. They will design the validation of the inference. The student could then apply the estimation method to data sets that have already been processed, to answer the question: how long are the cell cycle phases?
Work Environment	The student will be part of Sarah Robinson’s group and supervised by a computational biologist (postdoc). The main activity will be method development (programming). The development of the method will be a project in its own right, but it builds on a larger project in the group. To understand the data and the biology, the student will be able to talk with the experimentalist (postdoc) leading the main project. Sarah Robinson’s group is multidisciplinary: the student will interact with biologists as well as a mathematician and a biochemist. The student will manage their own working hours and presence in the lab in order to achieve the project’s goals. The student is expected to take part in the life of the group and of the lab.
References	Grewal, Jasleen K., Martin Krzywinski, and Naomi Altman. "Markov models—Markov chains." Nat. Methods 16.8 (2019): 663-664. https://doi.org/10.1038/s41592-019-0476-x Desvoyes, B., Arana-Echarri, A., Barea, M.D. et al. A comprehensive fluorescent sensor for spatiotemporal cell cycle analysis in Arabidopsis. Nat. Plants 6, 1330–1334 (2020). https://doi.org/10.1038/s41477-020-00770-4 Kumud Saini, Aditi Dwivedi, Aashish Ranjan, High temperature restricts cell division and leaf size by coordination of PIF4 and TCP4 transcription factors, Plant Physiology, Volume 190, Issue 4, December 2022, Pages 2380–2397, https://doi.org/10.1093/plphys/kiac345
Prerequisite Skills	Statistics, Probability/Markov Chains, Numerical Analysis
Other Skills Used in the Project	Image processing
Acceptable Programming Languages	Python
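Illustrative sketch	The project description refers to a parameter estimation method based on a Markov chain model with a gradient descent algorithm. The group's actual model is not described in this listing, so the Python sketch below rests on stated assumptions: a discrete-time cyclic chain in which a cell in phase k leaves for the next phase with probability p_k per imaging interval, observations summarised as "stayed" and "left" counts per phase, and hypothetical function names and phase labels. It fits the p_k by gradient descent on the negative log-likelihood and converts them to mean phase durations.

```python
import numpy as np

PHASES = ["G1", "S", "G2", "M"]  # hypothetical phase labels, for illustration only

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_exit_probabilities(stay_counts, leave_counts, lr=0.5, n_steps=2000):
    """Fit per-interval exit probabilities p_k by gradient descent.

    Assumed model: between two frames a cell in phase k moves to the next phase
    with probability p_k and otherwise stays. Every phase needs at least one
    observation. Writing p_k = sigmoid(theta_k) keeps p_k inside (0, 1).
    """
    stay = np.asarray(stay_counts, dtype=float)
    leave = np.asarray(leave_counts, dtype=float)
    total = stay + leave
    theta = np.zeros_like(total)
    for _ in range(n_steps):
        p = sigmoid(theta)
        # Gradient of -[leave*log(p) + stay*log(1 - p)] w.r.t. theta is total*p - leave;
        # dividing by total keeps the step size independent of the sample size.
        grad = (total * p - leave) / total
        theta -= lr * grad
    return sigmoid(theta)

def mean_phase_durations(p_exit, dt_hours):
    """Mean time spent in each phase for a geometric waiting time: dt / p."""
    return dt_hours / np.asarray(p_exit, dtype=float)

if __name__ == "__main__":
    stay = [90, 40, 30, 5]    # made-up counts of "still in the same phase"
    leave = [10, 10, 10, 5]   # made-up counts of "moved to the next phase"
    p_hat = fit_exit_probabilities(stay, leave)
    for name, hours in zip(PHASES, mean_phase_durations(p_hat, dt_hours=1.0)):
        print(f"{name}: about {hours:.1f} h")
```

For this simple model the maximum-likelihood estimate has the closed form leave / (stay + leave), which gives a direct check on the optimiser. Gradient descent becomes useful once the likelihood has no closed form, for example when only snapshot phase proportions are observed.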

Virtual labelling for label free microscopy

Project Title	Virtual labelling for label free microscopy
Keywords	microscopy, biology, image processing, machine learning
Project Listed	27 January 2025
Project Status	Filled
Contact Name	Jerome Boulanger and Leila Muresan
Contact Email	jeromeb@mrc-lmb.cam.ac.uk
Company/Lab/Department	MRC-LMB & University of Cambridge
Address	Francis Crick Avenue, Cambridge CB2 0QH & Downing Site, Cambridge CB2 3DY
Project Duration	8 weeks, around 9:30am to 5pm
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Fluorescence microscopy enables the localisation of proteins of interest in cells and their distribution across various organelles, providing valuable insight into the biophysical mechanisms that organise cell functions. These benefits come at the cost of phototoxicity and sample preparation. In recent years, original approaches to localising organelles have been proposed that rely on the gentler brightfield imaging modality and machine learning.
Project Description	This project will explore the advancement of virtual labelling techniques through the investigation of novel applications, such as larger 3D samples, and the development of advanced neural network architectures for image-to-image translation. This project offers a unique opportunity to contribute to the forefront of virtual labelling research while gaining invaluable experience in interdisciplinary collaboration.
Work Environment	This is a joint project between the MRC-LMB and the PDN/CAIC at the University of Cambridge. At the MRC-LMB you'll interact with the post docs in the team and across the institute. At PDN you'll interact with biologists and expert microscopists. You'll be able to split your time between these two locations.
References	V. A. Timonen et al., ‘DeepIFC: Virtual fluorescent labeling of blood cells in imaging flow cytometry data with deep learning’, Cytometry Part A, vol. 103, no. 10, pp. 807–817, 2023, doi: 10.1002/cyto.a.24770. B. Bai, X. Yang, Y. Li, Y. Zhang, N. Pillar, and A. Ozcan, ‘Deep learning-enabled virtual histological staining of biological samples’, Light Sci Appl, vol. 12, no. 1, p. 57, Mar. 2023, doi: 10.1038/s41377-023-01104-7. A. Somani et al., ‘Virtual labeling of mitochondria in living cells using correlative imaging and physics-guided deep learning’, Biomed Opt Express, vol. 13, no. 10, pp. 5495–5516, Sep. 2022, doi: 10.1364/BOE.464177. J. O. Cross-Zamirski, E. Mouchet, G. Williams, C.-B. Schönlieb, R. Turkki, and Y. Wang, ‘Label-free prediction of cell painting from brightfield images’, Sci Rep, vol. 12, no. 1, p. 10001, Jun. 2022, doi: 10.1038/s41598-022-12914-x. S. Cheng et al., ‘Single-cell cytometry via multiplexed fluorescence prediction by label-free reflectance microscopy’, Science Advances, vol. 7, no. 3, p. eabe0431, Jan. 2021, doi: 10.1126/sciadv.abe0431. F. Drawitsch, A. Karimi, K. M. Boergens, and M. Helmstaedter, ‘FluoEM, virtual labeling of axons in three-dimensional electron microscopy data for long-range connectomics’, eLife, vol. 7, p. e38976, Aug. 2018, doi: 10.7554/eLife.38976. E. M. Christiansen et al., ‘In Silico Labeling: Predicting Fluorescent Labels in Unlabeled Images’, Cell, vol. 173, no. 3, pp. 792-803.e19, Apr. 2018, doi: 10.1016/j.cell.2018.03.040.
Prerequisite Skills	Image processing
Other Skills Used in the Project	Machine learning
Acceptable Programming Languages	Python

Shaping electrical stimulation in hearing implants

Project Title	Shaping electrical stimulation in hearing implants
Keywords	Hearing, neuro-prosthetics, speech enhancement, neural decoding.
Project Listed	28 January 2025
Project Status	Filled
Contact Name	Dorothée Arzounian
Contact Email	dorothee.arzounian@mrc-cbu.cam.ac.uk
Company/Lab/Department	MRC Cognition and Brain Sciences Unit
Address	15 Chaucer Road, Cambridge, CB2 7EF
Project Duration	Typically 8 weeks
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Some people with profound deafness receive so-called cochlear implants that restore a sensation of sound by stimulating the hearing nerve with electrical currents. The electrical stimulation of the implant bypasses the outer, middle and inner ear, and this shortcut represents both a challenge and an opportunity. It is a challenge because we need to understand the functioning of these organs and overcome technological constraints to recreate natural sound sensations. It is an opportunity for sensory neuroscience because it allows the exploration of our perception of neural excitation patterns of the auditory nerve that are impossible to generate acoustically.
Project Description	The mathematician will be given the opportunity to address different types of problems according to their personal interests. Major tools for addressing these problems are existing computational models that predict the patterns of hearing nerve activation elicited by implant stimulation sequences, and the resulting perception in terms of speech sound recognition. The mathematician could for instance work on optimizing electrical stimulation patterns for various types of sound, and for speech in particular. Different questions may be addressed that relate to the best way of allocating sound frequency bands to individual stimulation channels, the best way to shape the profile of electrical currents in the cochlea (inner-ear organ) for each information channel, or the best way to code specific sounds (e.g. speech phonemes) as a multi-channel electrical sequence. Work from a previous CMP project looked into a method based on generative AI for exaggerating the differences between distinct phonemes (e.g. /d/ versus /g/, /t/ versus /k/, /a/ versus /e/, etc.) to make them easier to hear for implant users, and this work has several avenues for continuation. An alternative problem to address concerns the optimization of implant stimulation sequences for diagnostic purposes. Some physiological parameters differ between people but cannot be easily measured, although they should be taken into account when fitting implant settings to individual patients. These parameters are however likely to be reflected in brain activity induced by implant stimulation and measurable using non-invasive or semi-invasive brain recording techniques. Using models that predict the neural responses and capture the effect of these physiological parameters of interest, we want to determine electrical stimulation patterns that generate the most informative neural responses regarding the unknown parameters.
Work Environment	The mathematician will get supervision from and regular meetings with Dorothée Arzounian and additional inputs from a co-supervisor, based on the preferred topic. They will also be invited to join the weekly meetings of our research group comprising 9 researchers. They will be hosted in the MRC Cognition and Brain Sciences Unit (on a quiet residential street off Trumpington Road, 1 mile from Cambridge city centre), with opportunities to interact with researchers working in different fields of neuroscience and cognitive science.
References	Cochlear implant demo: https://deephearinglab.mrc-cbu.cam.ac.uk/ci-fi/ Macherey, O., & Carlyon, R. P. (2014). Cochlear implants. Current Biology, 24(18), R878–R884. https://doi.org/10.1016/j.cub.2014.06.053 Brochier, T., Schlittenlacher, J., Roberts, I., Goehring, T., Jiang, C., Vickers, D., & Bance, M. (2022). From Microphone to Phoneme: An End-to-End Computational Neural Model for Predicting Speech Perception With Cochlear Implants. IEEE Transactions on Biomedical Engineering, 69(11), 3300–3312. https://doi.org/10.1109/TBME.2022.3167113
Prerequisite Skills
Other Skills Used in the Project	Numerical Analysis, Simulation, https://deephearinglab.mrc-cbu.cam.ac.uk/ci-fi/
Acceptable Programming Languages	Python, MATLAB

Nested Sampling for ARIMA Model Selection in Astronomical Time Series

Project Title	Nested Sampling for ARIMA Model Selection in Astronomical Time Series
Keywords	ARIMA, Nested Sampling, Time Series Analysis, Bayesian Model Selection, Astronomical Data
Project Listed	29 January 2025
Project Status	Filled
Contact Name	Will Handley
Contact Email	wh260@cam.ac.uk
Company/Lab/Department	Institute of Astronomy
Address	Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge, CB3 0HA
Project Duration	8 weeks
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Time series analysis is a crucial tool in many scientific fields, particularly in astronomy with the advent of time domain surveys. The Zwicky Transient Facility (ZTF) and the upcoming Vera Rubin Observatory's Legacy Survey of Space and Time (LSST) are set to produce unprecedented amounts of time series data, creating both opportunities and challenges for astronomers. ARIMA (Autoregressive Integrated Moving Average) models are a well-established method for analyzing and forecasting time series data. However, selecting the optimal ARIMA model for a given dataset can be challenging, as it involves choosing the right combination of autoregressive, integrated, and moving average terms. Nested sampling, a powerful Bayesian computational method, has shown great promise in model selection problems. Recent advancements in nested sampling implementations, such as the BlackJax library, have made it possible to leverage GPUs and high-performance CPUs for these computations. This project aims to combine the strengths of ARIMA modeling and nested sampling to develop a robust, efficient method for time series analysis and model selection, with a particular focus on astronomical applications. This approach has the potential to significantly improve our ability to analyze and interpret the vast amounts of time series data expected from current and future astronomical surveys.
Project Description	The student will develop a novel approach to time series analysis by combining ARIMA models with nested sampling for model selection. The project will involve the following key steps: familiarization with ARIMA models and their implementation in Python; understanding the principles of nested sampling and the BlackJax library; developing a framework to generate a range of ARIMA models for a given time series; implementing nested sampling to compute the Bayesian evidence for each ARIMA model; creating a model selection procedure based on the computed evidences; testing the method on simulated time series data; applying the method to real astronomical time series from ZTF or similar surveys; and comparing the results with traditional ARIMA model selection techniques. The project is semi-open-ended, with the later stages depending on the success of the initial implementation. A successful outcome would be a working Python package that can perform ARIMA model selection using nested sampling, along with a demonstration of its effectiveness on both simulated and real astronomical data. The student will use their mathematical skills in several ways: understanding and implementing time series models (ARIMA); working with Bayesian statistics and model selection criteria; implementing and optimizing numerical algorithms (nested sampling); analyzing and interpreting results from both simulated and real data; and potentially developing new statistical metrics for model comparison. This project will be particularly interesting and useful as it combines established statistical techniques with cutting-edge computational methods to address a pressing need in modern astronomy. The resulting tool could significantly enhance our ability to analyze the vast amounts of time series data expected from current and future astronomical surveys.
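To make the idea of evidence-based model selection concrete, the sketch below compares a white-noise model with an AR(1) model on simulated data. It is only an illustration under stated assumptions: the noise variance is taken as known, the AR coefficient has a uniform prior on (-1, 1), and the evidence is computed by brute-force quadrature on a grid rather than by nested sampling or the BlackJax library. All function names are hypothetical. Quadrature is only feasible here because there is a single free parameter; nested sampling targets the same integral when the parameter space is too large for a grid.

```python
import numpy as np


def simulate_ar1(phi, n, rng, sigma=1.0):
    """Draw a length-n AR(1) series x[t] = phi * x[t-1] + Gaussian noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + sigma * rng.normal()
    return x


def log_like_ar1(phi, x, sigma=1.0):
    """Conditional Gaussian log-likelihood of AR(1), with sigma assumed known."""
    resid = x[1:] - phi * x[:-1]
    return (-0.5 * resid.size * np.log(2.0 * np.pi * sigma**2)
            - 0.5 * np.sum(resid**2) / sigma**2)


def log_evidence_ar1(x, n_grid=2001):
    """Log-evidence for AR(1) under phi ~ U(-1, 1), via the trapezoidal rule."""
    phis = np.linspace(-1.0, 1.0, n_grid)
    logl = np.array([log_like_ar1(p, x) for p in phis])
    shift = logl.max()                       # subtract the peak for stability
    integrand = 0.5 * np.exp(logl - shift)   # 0.5 is the U(-1, 1) prior density
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(phis))
    return shift + np.log(integral)


rng = np.random.default_rng(0)
series = simulate_ar1(phi=0.7, n=300, rng=rng)

# White noise has no free parameters here, so its evidence is its likelihood.
log_z_white = log_like_ar1(0.0, series)
log_z_ar1 = log_evidence_ar1(series)
log_bayes_factor = log_z_ar1 - log_z_white
print(f"log Bayes factor (AR(1) vs white noise): {log_bayes_factor:.1f}")
```

A positive log Bayes factor favours the AR(1) model; the same comparison extends to a grid of (p, d, q) orders once an evidence estimator that scales to more parameters is available.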
Work Environment	The student will be working as part of the Handley Lab at the Institute of Astronomy (IoA), which consists of PI Will Handley, 1 postdoc, and 8 PhD students (https://handley-lab.co.uk/group/). The lab collaborates closely with other research fellows in the Kavli Institute for Cosmology. The student will have ample support and opportunities for interaction. They will participate in: weekly group meetings; weekly one-on-one meetings with the PI; regular journal clubs and seminars; and group activities and social events. In addition to the Handley Lab, the student will be part of the broader IoA summer student community, which includes around twenty other students working on various astronomy projects. This provides further opportunities for networking and collaboration. The student will be provided with a desk in a shared office space with other summer students at the IoA. They are expected to work primarily on-site, typically during standard office hours (9am-5pm, Monday-Friday), but there may be flexibility depending on individual circumstances and project needs. Day-to-day, the student can expect to: work on their project independently; consult with the PI, postdocs, or PhD students when needed; attend relevant meetings and seminars; interact with other summer students; and have access to necessary computational resources. This environment offers a blend of independent research, mentorship, peer interaction, and exposure to cutting-edge astronomy research, providing an enriching summer experience for the student. Successful CMP projects have historically resulted in academic papers: https://arxiv.org/abs/2211.17248 and https://arxiv.org/abs/2112.07547
References	https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average https://en.wikipedia.org/wiki/Nested_sampling_algorithm MacKay, D. J. C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press. https://www.inference.org.uk/itprnn/book.pdf Zwicky Transient Facility (ZTF) website: https://www.ztf.caltech.edu/ Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) website: https://rubinobservatory.org/
Prerequisite Skills	Statistics, Probability/Markov Chains, Numerical Analysis
Other Skills Used in the Project	Predictive Modelling, Data Visualization
Acceptable Programming Languages	Python

Machine Learning Enhanced Cosmological Tension Detection

Project Title	Machine Learning Enhanced Cosmological Tension Detection
Keywords	Cosmological tensions, Machine learning, Normalizing flows, Bayesian inference, Dark energy
Project Listed	30 January 2025
Project Status	Closed
Contact Name	Will Handley
Contact Email	wh260@cam.ac.uk
Company/Lab/Department	Institute of Astronomy
Address	Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge CB3 0HA, Tel:+44 (0)1223 337548
Project Duration	8 weeks
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Tensions in cosmology, such as the Hubble tension and the weak lensing Ωm - σ8 tension, have been prominent in recent years. Previous work has developed methods to identify non-linear tension coordinates between cosmological datasets using neural networks. This project aims to build upon and update these techniques in light of recent advancements in both cosmological data and machine learning methods. Of particular interest is the recent controversial evidence for evolving dark energy from DESI (Dark Energy Spectroscopic Instrument) observations. This finding has sparked debate in the cosmological community and raises questions about potential systematic effects in supernova measurements. The project will incorporate the latest cosmological datasets from surveys such as DES, KiDS, HSC, ACT, SPT, DESI, and eROSITA. It will also explore the application of advanced machine learning techniques like normalizing flows and diffusion models to the problem of identifying and quantifying tensions in cosmological data. This work is interesting because it combines cutting-edge developments in both cosmology and machine learning to address fundamental questions about the nature of dark energy and the expansion of the universe. The results could provide new insights into the current tensions in cosmology and potentially guide future observational strategies.
Project Description	The student will work on extending and applying advanced machine learning techniques to identify and quantify tensions between cosmological datasets, building directly on existing data products and methods developed in the unimpeded project. The project will involve the following steps: Familiarization with existing framework: Understand the current implementation of piecewise normalizing flows and the margarine Python package used for accelerated Bayesian inference. Extend the machine learning models: Enhance the existing neural network models to incorporate the latest developments in normalizing flows and diffusion models. This will involve adapting the current piecewise approach to potentially improve its effectiveness in modeling complex cosmological probability densities. Apply to tension quantification: Use the enhanced models to identify non-linear tension coordinates across the grid of cosmological models produced by the unimpeded project. Focus particularly on tensions related to the Hubble constant, σ8, and ΩK, as well as the recent evidence for evolving dark energy from DESI. Comparative analysis: Compare the results of the enhanced machine learning approach with those obtained from traditional methods and the original piecewise normalizing flows. Investigate whether the new approach provides additional insights into the nature of the cosmological tensions. Optimization and scaling: Work on optimizing the implementation to handle the large-scale cosmological datasets efficiently, potentially extending the capabilities of the margarine package. Visualization and interpretation: Develop clear visualizations of the identified tension coordinates and prepare a report on the findings, with a particular focus on how the results contribute to our understanding of the evolving dark energy question. 
The student will use their mathematical skills in several ways: advanced statistical analysis and Bayesian inference; machine learning theory, particularly focusing on normalizing flows and diffusion models; numerical methods and optimization for handling large-scale cosmological data; data analysis and interpretation in the context of cosmological tensions; and scientific computing and algorithm development to enhance existing tools. A successful outcome would be the development of an improved machine learning approach that provides new insights into the tensions between cosmological datasets, particularly regarding the evolving dark energy question. The work should contribute to the public library of machine learning emulators, enhancing its capabilities for parameter estimation, model comparison, and tension quantification in cosmology.
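As a point of reference for what "quantifying a tension" can mean in the simplest case, the sketch below computes a parameter-difference statistic between two sets of posterior samples under a Gaussian approximation. This is not the method of the unimpeded project or the margarine package; it is the linear baseline that flow-based approaches aim to go beyond. The samples are synthetic, the two parameters are unnamed placeholders, and the function names are hypothetical.

```python
import numpy as np


def gaussian_tension_chi2(samples_a, samples_b):
    """Chi-squared of the mean parameter difference, Gaussian approximation.

    Assumes the two posteriors are independent and roughly Gaussian, so the
    difference of means has covariance cov_a + cov_b.
    """
    mean_diff = samples_a.mean(axis=0) - samples_b.mean(axis=0)
    cov_sum = np.cov(samples_a, rowvar=False) + np.cov(samples_b, rowvar=False)
    return float(mean_diff @ np.linalg.solve(cov_sum, mean_diff))


def p_value_two_params(chi2):
    """Survival function of a chi-squared variable with 2 degrees of freedom."""
    return float(np.exp(-0.5 * chi2))


# Synthetic stand-ins for two experiments' posterior samples (not real data).
rng = np.random.default_rng(1)
cov = 0.25 * np.eye(2)
samples_a = rng.multivariate_normal([0.0, 0.0], cov, size=20000)
samples_b = rng.multivariate_normal([1.5, -1.0], cov, size=20000)

chi2 = gaussian_tension_chi2(samples_a, samples_b)
p_value = p_value_two_params(chi2)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.3f}")
```

When the posteriors are strongly non-Gaussian, the mean and covariance no longer summarise them well, which is one motivation for learning the densities with normalizing flows instead.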
Work Environment	The student will be working as part of the Handley Lab at the Institute of Astronomy (IoA), which consists of PI Will Handley, 2 postdocs, and 9 PhD students (https://handley-lab.co.uk/group/). The lab collaborates closely with other research fellows in the Kavli Institute for Cosmology. The student will have ample support and opportunities for interaction. They will participate in: weekly group meetings; weekly one-on-one meetings with the PI; regular journal clubs and seminars; and group activities and social events. In addition to the Handley Lab, the student will be part of the broader IoA summer student community, which includes around twenty other students working on various astronomy projects. This provides further opportunities for networking and collaboration. The student will be provided with a desk in a shared office space with other summer students at the IoA. They are expected to work primarily on-site, typically during standard office hours (9am-5pm, Monday-Friday), but there may be flexibility depending on individual circumstances and project needs. Day-to-day, the student can expect to: work on their project independently; consult with the PI, postdocs, or PhD students when needed; attend relevant meetings and seminars; interact with other summer students; and have access to necessary computational resources. This environment offers a blend of independent research, mentorship, peer interaction, and exposure to cutting-edge astronomy research, providing an enriching summer experience for the student. Successful CMP projects have historically resulted in academic papers: https://arxiv.org/abs/2211.17248 and https://arxiv.org/abs/2112.07547
References	1. Previous Summer & Part III Project write-up https://people.phy.cam.ac.uk/wh260/Galileo/Part_III_Projects/2020/Bayesian_information_theory_and_the_Hubble_constant_crisis/ 2. Handley, W. & Lemos, P. (2019). Quantifying dimensionality: Bayesian cosmological model complexities. Phys. Rev. D, 100, 023512. https://arxiv.org/abs/1903.06682 3. DESI Collaboration. (2024). DESI 2024 VI: Cosmological Constraints from the Measurements of Baryon Acoustic Oscillations. https://arxiv.org/abs/2404.03002 4. Efstathiou, G. (2024). Evolving Dark Energy or Supernovae Systematics? https://arxiv.org/abs/2408.07175 5. Papamakarios, G., et al. (2021). Normalizing Flows for Probabilistic Modeling and Inference. Journal of Machine Learning Research, 22(57), 1-64. https://jmlr.org/papers/v22/19-1028.html 6. Di Valentino, E., et al. (2021). In the realm of the Hubble tension—a review of solutions. Classical and Quantum Gravity, 38(15), 153001. https://arxiv.org/abs/2103.01183
Prerequisite Skills	Statistics, Probability/Markov Chains, Numerical Analysis
Other Skills Used in the Project	Simulation, Predictive Modelling, Data Visualization
Acceptable Programming Languages	Python

Real-Time Air Quality Forecasting for Proactive Policy Interventions

Project Title	Real-Time Air Quality Forecasting for Proactive Policy Interventions
Keywords	Kalman filter, stochastic trend, Gompertz curve, PM2.5, Air Quality Index, Weather
Project Listed	30 January 2025
Project Status	Filled
Contact Name	Paul Kattuman
Contact Email	p.kattuman@jbs.cam.ac.uk
Company/Lab/Department	Judge Business School, University of Cambridge
Address	Trumpington Street, Cambridge CB2 1AG
Project Duration	8 weeks between late June and September
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Severe air pollution surges increasingly threaten public health in major cities. For example, in India's National Capital Region, air quality deteriorates to severely hazardous levels in multiple waves each winter. Policymakers rely on emergency mitigation measures, which have so far been reactive and of very limited effectiveness. This project will apply a forecasting framework for air quality that builds on recent research in time series methods. The key objective is to provide actionable forecasts that could shift policy from a reactive to a proactive mode.
Project Description	The technical aspects of the state-space modelling approach with Kalman filtering for air quality forecasting are based on adapting methods originally developed for epidemic forecasting and reported in Harvey and Kattuman (2020). The state-space framework, particularly the local linear trend applied to a transformation of a Gompertz curve function, is well-suited for dynamic forecasting of pollution waves. This helps in identifying dynamic short-term trends, allowing for meteorological and environmental factors and policy interventions. Integrating real-time adaptive filtering techniques that adjust estimates dynamically as new data becomes available will be helpful in addressing volatility in air quality.
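The filtering step described above can be sketched with a generic local linear trend model, in which the observation is a noisy level and the level evolves with a slowly changing slope. This is a minimal illustration in Python (the project itself prefers R), not the specification used in Harvey and Kattuman (2020): the Gompertz transformation is omitted, the variances are fixed by hand rather than estimated, and the function names and the synthetic series are hypothetical.

```python
import numpy as np


def kalman_local_linear_trend(y, var_obs, var_level, var_slope):
    """Kalman filter for y[t] = level[t] + noise, with a random-walk slope.

    State is [level, slope]; level[t] = level[t-1] + slope[t-1] + noise.
    Returns the filtered states and the final state mean and covariance.
    """
    T = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition matrix
    Z = np.array([1.0, 0.0])                 # observation picks out the level
    Q = np.diag([var_level, var_slope])      # state noise covariance
    a = np.array([y[0], 0.0])                # crude initial state mean
    P = 1e4 * np.eye(2)                      # vague initial state covariance
    filtered = np.zeros((len(y), 2))
    for t, obs in enumerate(y):
        if t > 0:                            # prediction step
            a = T @ a
            P = T @ P @ T.T + Q
        innovation = obs - Z @ a             # update step with the new data
        F = Z @ P @ Z + var_obs
        K = P @ Z / F
        a = a + K * innovation
        P = P - np.outer(K, Z @ P)
        filtered[t] = a
    return filtered, a, P


def forecast_level(state_mean, steps_ahead):
    """Point forecast: extrapolate the current level along the current slope."""
    return state_mean[0] + steps_ahead * state_mean[1]


# Synthetic series with a known linear trend (not air quality data).
rng = np.random.default_rng(2)
time = np.arange(100)
y = 2.0 + 0.5 * time + rng.normal(scale=0.1, size=time.size)

filtered, last_state, last_cov = kalman_local_linear_trend(
    y, var_obs=0.01, var_level=1e-6, var_slope=1e-6)
five_step_forecast = forecast_level(last_state, 5)
print(f"slope = {last_state[1]:.3f}, 5-step forecast = {five_step_forecast:.2f}")
```

Because each new observation only triggers one predict-and-update cycle, the estimates adjust as data arrive, which is the property that makes the approach suitable for real-time use.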
Work Environment	The student will work largely on their own, with close supervision from Paul Kattuman and a postdoc.
References	1. Kattuman, P., Harvey, A., Singh, V. (2024). 'Air Pollution in Delhi: Does GRAP Grip at all?', presented at Indian Statistical Institute, New Delhi. 2. Harvey, A., Kattuman, P. (2020). 'Time series models based on growth curves with applications to forecasting coronavirus', Harvard Data Science Review, Special Issue 1 - COVID-19. https://doi.org/10.1162/99608f92.828f40de 3. Harvey, A., Kattuman, P. and Thamotheram, C. (2021). 'Tracking the mutant: Forecasting and nowcasting COVID-19 in the UK in 2021', National Institute Economic Review, 256, pp. 110–26. https://doi.org/10.1017/nie.2021.12 4. Harvey, A. and Kattuman, P. (2021). 'Farewell to R: time-series models for tracking and forecasting epidemics', Journal of the Royal Society Interface, 18, 210179. https://doi.org/10.1098/rsif.2021.0179
Prerequisite Skills	Statistics, Time Series models
Other Skills Used in the Project	Predictive Modelling
Acceptable Programming Languages	R (preferred)

Communicating mathematics - uncovering the story behind new research

Project Title	Communicating mathematics - uncovering the story behind new research
Keywords	Communication, public engagement, mathematics, statistics, data science
Project Listed	30 January 2025
Project Status	Filled
Contact Name	Marianne Freiberger, Rachel Thomas
Contact Email	mf344@cam.ac.uk, rgt24@cam.ac.uk
Company/Lab/Department	Plus (plus.maths.org), Millennium Mathematics Project, DAMTP
Address	Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge
Project Duration	8 weeks from Monday June 30 to August 22
Project Open to	Undergraduates, Master's (Part III) students
Background Information	The mathematical sciences are becoming ever more visible as key tools in understanding the world we live in and addressing societal and individual challenges — from artificial intelligence and other advances in technology, to climate change and public and individual health. At the same time, mathematics remains one of the hardest fields to access for people who are outside the mathematics community. Many different audiences might want or need to engage with mathematics and related subjects: researchers from other fields, teachers and potential students, policy makers, the mainstream media, "users" of mathematics in industry and science, and the interested public. To enable these wider audiences to engage with the mathematical sciences, in particular with current research, translations are needed that are unbiased and clear, while retaining mathematical and scientific accuracy and rigour. Plus.maths.org provides a gateway to mathematics and related sciences, in particular to current research in these fields, for non-expert audiences through articles, podcasts and videos that are freely accessible on our website. The content ranges from basic explainers of particular concepts to in-depth explorations of particular areas or applications, and is produced by the Editors in direct collaboration with researchers. Plus.maths.org is part of the Millennium Mathematics Project (mmp.maths.org), directed by Professor Julia Gog. As well as providing communications expertise to the University of Cambridge's Maths Faculty, we also work directly with different research groups and organisations, including ongoing collaborations with the Isaac Newton Institute, the JUNIPER network and the Maths4DL research project. Previous collaborations have included the Stephen Hawking Centre for Theoretical Cosmology and Discovery+, the Cantab Capital Institute for the Mathematics of Information, the Cambridge Mathematics of Information in Healthcare Hub, among others.
Project Description	Plus.maths.org works directly with researchers through a number of collaborations. This might involve reporting on events or research papers from collaboration partners, or working with the partners to identify a potential topic to bring to life for a non-expert audience. The aim of the project would be to develop content to sit within our coverage of a particular collaboration, based on the intern's own mathematical interest and in collaboration with the plus.maths.org editors. This would involve: identifying topics, concepts, and/or applications of the mathematics in question that are interesting and relevant for a non-expert audience; finding relevant content in our archive; producing one or more articles and/or podcasts to form part of the coverage (this could involve interviewing researchers and/or attending talks and conferences and reading research papers); and developing the resulting collection of content to be published on plus.maths.org. The project would also involve preparing content for publication on our website, promotion of content on our social media channels, as well as being involved in the routine maintenance of our site. The project would broaden the intern's mathematical horizon, give a view into the world of mathematical research, hone their communication skills, and familiarise them with all aspects of online publishing. The output would be articles and/or podcasts published under the intern's name that will remain available indefinitely on the plus.maths.org site. We would very much welcome the intern's own creative input, both in terms of the content itself and the way it could be promoted both on our site and on social media.
Work Environment	The intern will work closely with the Editors and be part of the wider MMP team. They can join other meetings both with collaborators and also with our web developer and web application developer. They would also have the opportunity to attend scientific meetings and conduct interviews with researchers as required. The intern would work in a hybrid pattern, with 1-2 days a week in the CMS and the rest of their time remote, with regular communication with the plus.maths.org editors throughout the week.
References	Here are some examples of collections of content produced for particular INI programmes and events. The intern's role would be to help develop a similar collection and produce some of its content. Helping AI to learn some physics https://plus.maths.org/content/helping-ai-learn-some-physics This collection uses existing background material from our archive to build towards a new article and podcast exploring an important research area for our Maths4DL colleagues. This topic was identified through discussions during planning meetings with the Maths4DL team. And the article was based on a series of interviews with researchers. Moduli spaces: A journey through the hidden world of shapes https://plus.maths.org/content/moduli-spaces-journey-world-shapes-0 This collection contains a three-part article exploring a particular mathematical concept that was relevant to an Isaac Newton Institute research programme on algebraic and differential geometry. The article was based on a conversation with the organisers of the programme, who helped us identify the topic to be explored, explained the mathematics, and helped produce the article itself. The collection also contains relevant background material from our archive. Preparing for Disease X https://plus.maths.org/content/preparing-disease-x This article is based on a selection of talks from an event organised by our JUNIPER collaborators at the Isaac Newton Institute. As well as bringing together themes from the talks, the article draws extensively from research papers and ongoing conversations with the speakers.
Prerequisite Skills	Enthusiasm for communicating science to non-expert audiences in any format; Excellent written communication skills; Curiosity about different areas of mathematics
Other Skills Used in the Project	Writing skills, podcasting, social media, and online publishing skills
Acceptable Programming Languages	None Required

Tracking changes in human sensory performance over time

Project Title	Tracking changes in human sensory performance over time
Keywords	Human behaviour, Perception, Random processes, Estimation theory, Optimization
Project Listed	3 February 2025
Project Status	Closed
Contact Name	Dorothée Arzounian
Contact Email	dorothee.arzounian@mrc-cbu.cam.ac.uk
Company/Lab/Department	MRC Cognition and Brain Sciences Unit
Address	15 Chaucer Road, Cambridge CB2 7EF
Project Duration	Typically 8 weeks
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Psychophysics is the science that draws relationships between physical stimuli and perceptual experiences (for instance, between light wavelength and colour, between sound frequency and pitch, etc.) and mainly relies on behavioural experimentation where human participants give categorical responses to sensory stimuli. The statistics of the responses from large numbers of task trials then gives information about the participant's sensitivity to a stimulus parameter of interest. Sensitivity measures are important to quantify how we detect and differentiate stimuli, to determine the limits of human perception, understand individual and environmental factors that influence perception, and help in diagnosing sensory disorders (e.g. hearing loss, visual impairment) or neurological conditions. In some cases, it is possible that sensitivity changes over the course of an experiment or measurement procedure, for instance because the research participant or patient gets distracted, or because their sensitivity improves from training on the task. Detecting and quantifying such changes in performance would be helpful both to improve data interpretation and for the study of sensory training effects per se. For instance, when studying the effect of a treatment on hearing in a longitudinal drug study, it is important to know if changes in auditory performance between pre- and post-treatment phases could be due to task-learning occurring within each testing session. However, this is complicated by the statistical nature of sensitivity estimates: there is a trade-off between the accuracy of a time-dependent estimate and its temporal resolution, both differently affected by the number of data points (task trials) that are grouped in time to make a single estimate. 
Over the years, researchers have developed and improved methods to efficiently estimate sensory sensitivity using a limited number of task trials, under the assumption that sensitivity did not change over the course of the experiment. Most of these methods rely on adaptive procedures that optimize the values of stimulus parameters used in successive trials in order to collect responses for the values that yield the most information about sensitivity. We propose here to capitalize on these methods and to extend them in such a way as to develop a method for the continuous estimation of time-varying sensitivity.
Project Description	The objectives will be to (1) determine optimal ways to reconstruct the time-course of sensitivity from measurable experiment outcomes (time-series of stimulus-response pairs) and (2) identify the experimental procedures that favour reconstruction accuracy. The student may use existing mathematical models of human behaviour in sensory tasks to numerically simulate experiments with different adaptive procedures and procedure parameters. They may explore different theoretical scenarios where sensory sensitivity changes over time in a more or less auto-correlated fashion (occasional attention lapses, training-induced improvements in sensitivity). There will be ample opportunities for the mathematician to suggest their own original ideas for approaching the problem, especially concerning the first objective. The ultimate aim will be to produce and publish practical guidelines that researchers may use in experimental research to track changes in sensory sensitivity or other statistical measures of behaviour. Prior programming experience and knowledge of basic statistics are the main prerequisites for this project. The mathematician may choose to work with their preferred programming language; however, some scripts for numerical simulations will be readily available in MATLAB. A good intuition for probability and stochastic processes, and prior experience with Monte-Carlo simulations, numerical analysis, estimation theory, optimization and/or automation problems will be additional assets but are not required.
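A simulated experiment of the kind described above might look like the following sketch. It assumes a two-alternative forced-choice task with a logistic psychometric function, a 2-down/1-up staircase as the adaptive procedure, and a threshold that improves linearly with training. These are illustrative choices, not the models or scripts used by the group, and all names and parameter values are hypothetical.

```python
import numpy as np


def p_correct(level, threshold, spread=1.0, guess=0.5):
    """Logistic psychometric function for a two-alternative task."""
    return guess + (1.0 - guess) / (1.0 + np.exp(-(level - threshold) / spread))


def run_staircase(thresholds, start_level, step, rng):
    """2-down/1-up staircase run against a threshold that varies per trial."""
    level = start_level
    correct_in_a_row = 0
    levels = np.empty(len(thresholds))
    responses = np.empty(len(thresholds), dtype=bool)
    for t, threshold in enumerate(thresholds):
        levels[t] = level
        correct = rng.random() < p_correct(level, threshold)
        responses[t] = correct
        if correct:
            correct_in_a_row += 1
            if correct_in_a_row == 2:      # two correct: make the task harder
                level -= step
                correct_in_a_row = 0
        else:                              # one error: make the task easier
            level += step
            correct_in_a_row = 0
    return levels, responses


def moving_average(x, window):
    """Sliding-window mean; a wider window is smoother but blurs fast changes."""
    return np.convolve(x, np.ones(window) / window, mode="valid")


rng = np.random.default_rng(3)
n_trials = 800
true_threshold = np.linspace(10.0, 4.0, n_trials)   # training-induced improvement
levels, responses = run_staircase(true_threshold, start_level=12.0, step=0.5, rng=rng)

tracked = moving_average(levels, window=50)
early_mean = levels[: n_trials // 4].mean()
late_mean = levels[-(n_trials // 4):].mean()
print(f"mean level, first quarter: {early_mean:.2f}; last quarter: {late_mean:.2f}")
```

The window length in `moving_average` makes the trade-off from the background section explicit: more trials per estimate reduce its variance but lower its temporal resolution.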
Work Environment	The mathematician will get supervision from and regular meetings with Dorothée Arzounian. They will also be invited to join the weekly meetings of our research group comprising 9 researchers. They will be hosted in the MRC Cognition and Brain Sciences Unit (on a quiet residential street off Trumpington Road, 1 mile from Cambridge city centre), with opportunities to interact with researchers working in different fields of neuroscience and cognitive science. On-site working at typical hours is encouraged to make the most of available interactions, but occasional remote working is possible if the mathematician needs flexibility.
References	Doll, R. J., Veltink, P. H., & Buitenweg, J. R. (2015). Observation of time-dependent psychophysical functions and accounting for threshold drifts. Attention, Perception, & Psychophysics, 1-8. https://doi.org/10.3758/s13414-015-0865-x
Prerequisite Skills	Statistics, Programming
Other Skills Used in the Project	Probability/Markov Chains, Numerical Analysis, Mathematical Analysis, Simulation
Acceptable Programming Languages	Python, MATLAB, No Preference

Understanding skin tone bias in photoacoustics

Project Title	Understanding skin tone bias in photoacoustics
Keywords	photoacoustics, ultrasound, inverse problems, optimisation, reconstruction
Project Listed	3 February 2025
Project Status	Filled
Contact Name	Sarah Bohndiek
Contact Email	seb53@cam.ac.uk
Company/Lab/Department	Department of Physics
Address	JJ Thomson Avenue, Cambridge
Project Duration	Flexible but usually 8 weeks between late June and September.
Project Open to	Master's (Part III) students
Background Information	Ultrasound is emitted when a short enough pulse of light is absorbed within biological tissue, e.g. when a 10 ns pulse of light is absorbed by hemoglobin in blood. In Photoacoustic Tomography (PAT), such ultrasound pulses are detected at the tissue surface and an inverse source problem is solved to form an image of the source distribution. Because of the dependence of light absorption on molecular content, PAT is a molecular imaging modality, and is currently being explored in clinical trials for diseases from cancer to inflammation. A problem arises when the skin absorbs some of the light as it travels into the tissue, as a planar ultrasound (US) pulse is generated that propagates into the tissue and scatters back to the detector array, just as in US imaging. Indeed, with strong enough skin absorption, these back-scattered signals can be used to form a US image (by making a single-scattering approximation). However, in the general case, these back-scattered US waves are not helpful and cause artifacts in the PAT image.
Project Description	The task in this project is to explore ways in which the PA and US signals can be separated to give both a PAT image, related to optical absorption, and a US reflection image, related to the tissue's acoustic properties. In other words, the project is concerned with the inverse problem of recovering two images from one set of data. In general, this inverse problem will be ill-posed (it will not have a unique solution), so solving it will require either additional data or some prior information. Here are some possible directions for the project: Linear model. PAT and US imaging (under the Born approximation) can be modelled separately using linear operators that map from the tissue properties to the measured signals. These forward operators can be constructed as matrices for small-scale problems, e.g. using the MATLAB-based toolbox k-Wave (www.k-wave.org), and then combined to model our case. How ill-conditioned the combined matrix is, and so how feasible it will be to form an image using it, can then be explored for a variety of scenarios, e.g. corresponding to things that can be changed experimentally, such as the optical wavelength or the shape of the skin surface. Joint-image reconstruction. Without the Born approximation, the US forward model becomes nonlinear. In this case we can attempt to recover the PAT image and the acoustic properties by minimising a loss (or objective) function, potentially incorporating a regularisation term. One direction this work could take is a review of the relevant literature (there is relevant work on reflection-mode ultrasound tomography as well as joint PAT-sound speed recovery). Alternatively, a toy problem could be examined numerically to assess the ill-posedness, e.g. by examining the Hessian of the loss function. Another idea is to regularise the inversion by segmenting the domain to reduce the number of unknowns. Finally, gradient-based methods, such as those popular in machine learning, could be used to try to solve the inversion.
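As a rough illustration of the linear-model direction, the sketch below stacks two stand-in forward operators into one matrix, inspects its singular values and the Hessian of a least-squares loss, and then runs plain gradient descent with a Tikhonov regularisation term. Everything in it is an assumption made for illustration: the operators `A_pa` and `A_us` are random matrices rather than real PAT or US models, and the sizes `n` and `m`, the regularisation weight `lam`, and the iteration count are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20   # unknowns per image (hypothetical toy size)
m = 30   # number of measurements, fewer than the 2*n unknowns

# Random stand-ins for the two linear forward operators (NOT real physics).
A_pa = rng.standard_normal((m, n))   # optical absorption image -> data
A_us = rng.standard_normal((m, n))   # acoustic reflectivity image -> data

# One data set depends on both images: d = A_pa @ p + A_us @ r = A @ x
A = np.hstack([A_pa, A_us])

singular_values = np.linalg.svd(A, compute_uv=False)
rank = np.linalg.matrix_rank(A)
print("A has shape", A.shape, "and rank", rank)

# The Hessian of 0.5*||A x - d||^2 is A^T A; near-zero eigenvalues mark
# directions in the unknowns that the data cannot constrain.
hess_eigs = np.linalg.eigvalsh(A.T @ A)
null_dim = int(np.sum(hess_eigs < 1e-8 * hess_eigs.max()))
print("unconstrained directions:", null_dim)

# Synthetic ground truth and noise-free data.
x_true = rng.standard_normal(2 * n)
d = A @ x_true

lam = 1e-2  # Tikhonov weight (arbitrary choice)

def loss_and_grad(x):
    """Return 0.5*||A x - d||^2 + 0.5*lam*||x||^2 and its gradient."""
    residual = A @ x - d
    loss = 0.5 * residual @ residual + 0.5 * lam * x @ x
    grad = A.T @ residual + lam * x
    return loss, grad

# Step size 1/L, where L is the largest eigenvalue of the regularised Hessian.
step = 1.0 / (singular_values[0] ** 2 + lam)
x = np.zeros(2 * n)
losses = []
for _ in range(2000):
    loss, grad = loss_and_grad(x)
    losses.append(loss)
    x = x - step * grad

print("initial loss:", losses[0], "final loss:", losses[-1])
print("distance to ground truth:", np.linalg.norm(x - x_true))
```

In this toy setting the data are fitted well but the ground truth is not recovered, because the regulariser only selects one of many images consistent with the data. Replacing the random matrices with operators built from simulations, or changing `m` relative to `2*n`, would be one way to explore how the conditioning varies between scenarios.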
Work Environment	The student will be part of a lab, co-supervised by Sarah Bohndiek in Cambridge Physics and Ben Cox in UCL Medical Physics, London. They would ideally work in person at least 3 days a week on average over the project, but we can make arrangements for remote working if preferred.
References	You can find out more about photoacoustic imaging and the skin tone bias problem through the following references: https://www.sciencedirect.com/special-issue/104ZMLK4X99 https://www.frontiersin.org/journals/physics/articles/10.3389/fphy.2022.1028258/full https://pmc.ncbi.nlm.nih.gov/articles/PMC10732256/
Prerequisite Skills	Statistics, Mathematical physics, Image processing, Simulation, Data Visualization
Acceptable Programming Languages	Python

