## 2024 Summer Research in Maths (SRIM) projects

Below you will find a list of specific projects as well as general outlines for possible projects that have been submitted by Faculty members looking to recruit a summer research student. **New projects may be added throughout January and February so do check back!**

You can enquire about any projects you are interested in by writing to the contact given in the project listing below.

Applying for one of the projects listed on this page is not the only way of finding a research project in the CMS for summer 2024. Many supervisors do not advertise a specific project, but are happy to discuss possibilities with interested students. You can contact academics from this list of pre-approved supervisors in the subject areas you are interested in to enquire about a project with them this summer.

Once you have identified up to three projects / supervisors you should submit the __SRIM Expression of Interest__ form (Cambridge maths students) or __Internship Expression of Interest__ form (applicants from other departments and universities – please check you are eligible as an __external applicant__ before contacting supervisors about projects), by **23 February 2024**.

## Group Projects *New in 2024*

Mathematics research is an extremely collaborative subject and we believe that group projects where 2-3 students work together collaboratively on one problem more accurately reflects the experience you might have during a PhD in mathematics. Furthermore, feedback has shown that working together in a group is more enjoyable for summer students. In 2024 we are committed to offering the below group projects. Funding has already been set aside for these projects so a bursary application will not be necessary for the students selected.

**John Taylor (DAMTP, Ocean Dynamics Group) - Modelling ocean carbon solutions**

Keywords:*Fluid Dynamics, Simulation, Ocean, Biology, Climate***Rajen Shah (DPMMS / Statistical Laboratory) - Distribution Shift in Statistics and Machine Learning**

Keywords:*Statistics, Machine Learning***Henry Wilton (DPMMS) - Surfaces, graphs and commutators**

Keywords:*Algebra, Differential Geometry and Topology*

## Specific Projects

**Lloyd Fung (DAMTP / Eric Lauga's group) - Distinguishing mechanisms driving accumulation of stochastic swimmers near or away from surfaces**

Keywords:*Biofilm; Microswimmers; Diffusion; Fluid Dynamics; Mathematical Biology***Nuno Oliveira (DAMTP) - The evolution of bacterial chemotaxis towards antibiotics**

Keywords:*antibiotic resistance, bacterial motility, chemotaxis, microbial warfare***Fatemeh Rezaee (DPMMS) - Behrend function for higher dimensional schemes via elementary mathematics**

Keywords:*Behrend function, enumerative geometry, Euclidean geometry of parallelepipeds, monomial ideals, combinatorics***Nuno Miguel Oliveira and George Fortune (DAMTP) - Pushing dynamics of developing biofilms**

Keywords:*bacteria, biofilm, growth confinement, governing dynamics, pushing***Arieh Iserles (DAMTP) - Spectral methods for time-dependent PDEs**

Keywords:*Spectral methods, Orthogonal systems, Orthogonal polynomials***Richard Samworth (DPMMS / Stats Lab) - What is the statistical price for computational speed?**

Keywords**:***Statistical-computational trade-off, minimax lower bounds, low-degree polynomials***Richard Samworth (DPMMS / Stats Lab) - Missing data in high dimensions**

Keywords:*Missing data, multiple imputation***Angelica Aviles-Rivero (DAMTP / Cambridge Image Analysis)**-**PDEs Through Deep Learning**

Keywords:*PDEs, Deep Learning, Neural Operators, Physics-Informed Networks***Grigalius Taujanskas (DPMMS) - Asymptotic Decomposition of a Charged Scalar Field in de Sitter Space**

Keywords:*general relativity, asymptotic expansions, conformal method, nonlinear PDE***Ander Biguri (DAMTP / Cambridge Image Analysis) - Implementation of Multi-Scale Dense Networks for Computed Tomography Reconstruction**

Keywords:*Machine Learning, Tomographic Reconstruction, Inverse problems, Applications, Programming*

## General project outlines / possibilities

**Fatemeh Rezaee (DPMMS) - Tangent space to the Hilbert schemes of points****Fatemeh Rezaee (DPMMS) - Behrend function for higher dimensional monomial schemes****Adrien Lefauve (DAMTP / Environmental Fluid Dynamics) - How will estuarine mixing and salinity evolve under a changing climate?****Angelica Aviles-Rivero (DAMTP) - Single-Shot Unrolling Algorithm for Inverse Problem****Amelia Drew (DAMTP / Theoretical Cosmology) - Cosmic Strings and Boson Stars****Fatemeh Rezaee (DPMMS) - Algebraic geometry, combinatorics, and machine learning****Zak Shumaylov & Carola-Bibiane Schönlieb (DAMTP / Cambridge Image Analysis) - Deep neural networks in physics-based inverse imaging problems****Henry Moss (DAMTP / Institute of Computing for Climate Science) - Efficient Probabilistic Machine Learning using Newton's Identities****Marcelo Campos (DPMMS / Combinatorics) - Majority Dynamics in Graphs****Jan Petr (PhD student in DPMMS with Prof Bela Bollobas) - Upper bounds on third eigenvalues of adjacency matrices**

## Group Projects

## Modelling ocean carbon solutions

Project Title | Modelling ocean carbon solutions |

Keywords | Fluid Dynamics, Simulation, Ocean, Biology, Climate |

Project Listed | 22 January 2024 |

Contact Name | John Taylor |

Contact Email | jrt51@cam.ac.uk |

Company/Lab/Department | DAMTP, Ocean Dynamics Group |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information | There is a growing consensus that strategies will be needed to remove carbon dioxide from the atmosphere to avoid the worst consequences of climate change. Due to their very large area and capacity to sequester carbon for long timescales, the ocean has received attention as a possible venue for carbon dioxide removal (CDR). Various ocean CDR strategies have been proposed including approaches based on seawater chemistry (e.g. enhanced rock weathering) and biology (e.g. growing kelp forests). However, the ocean is a very dynamic fluid, and the interactions with the flow physics and the carbon and biological systems remain poorly understood and under-sampled. |

Project Description | This project would be for two students to work on a group project to help develop, test, and apply a computational tool to simulate the fluid dynamics and biological and chemical interactions at play in ocean CDR applications. This would use a code currently under development called OceanBioME (https://github.com/OceanBioME/OceanBioME.jl). OceanBioME uses the Julia proramming language and is designed to run very quickly on GPUs. Students could chose to focus more on code development, or using existing capabilities of OceanBioME appled to specific problems. Prior familiarity with the Julia programming language would be benficial, but is not required. Since this is a group project, it is essential that the student is willing to work collaboratively as part of a team. |

Work Environment | This is a group project, and students will work closely with PhD students and postdocs in DAMTP. The students will also have the opportunity to engage with other summer students working in the same general area through the Centre for Climate Repair. The team will be supervised by Prof. Taylor. |

References | https://github.com/OceanBioME/OceanBioME.jl |

Prerequisite Skills | Fluids, PDEs |

Other Skills Used in the Project | Fluids, Numerical Analysis, Simulation, Predictive Modelling, Data Visualization |

Acceptable Programming Languages | Julia |

## Distribution Shift in Statistics and Machine Learning

Project Title | Distribution Shift in Statistics and Machine Learning |

Keywords | Statistics, Machine Learning |

Project Listed | 23 January 2024 |

Contact Name | Rajen Shah |

Contact Email | rds37@cam.ac.uk |

Company/Lab/Department | DPMMS / Statistical Laboratory |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information | The assumption that data are i.i.d. is ubiquitous in Statistics, both in the design of methodology and their theoretical analysis. However, if for instance, data have been collected in different environments, this assumption may not always be satisfied. Instead, the distribution of training and test data may be very different, and as a consequence, for example in industrial applications the real-world performance of machine learning methods may be noticeably worse than expected. Similarly in the sciences, the failure that many results fail to replicate in follow-up experiments conducted in a different laboratory say, is widely recognised and often attributed to distributional shifts. This is an issue of genuine practical importance, and there is now a rapidly growing body literature in both Statistics and Machine Learning attempting to develop new methods and theory to tackle these problems. |

Project Description | A first step towards addressing the issues is to understand what features of the distributional change are responsible for results not replicating or degradation of performance. One possible aim of this project would be to develop and evaluate a method for testing whether the conditional distribution of the response Y, given the predictors X, is constant or not across datasets. This problem is related to the important problem of testing conditional independence, and methods for the latter may be re-purposed for testing for this form of potential distributional shift. Another possible direction would be to consider cases where the distribution of the predictors X is different among the train and test data, while the regression function itself is constant. Fitting to the training data alone may then be suboptimal, and it would be desirable to tailor the performance of a fitted model to the test dataset of interest. Indeed, such a strategy often underlies the strong performance of successful Kaggle entries. To address this in a more principled way, one could look at modifying the tree-splitting criterion for a decision tree-based learner, to better account for test data, for example. |

Work Environment | Group project for 2-3 students working together. |

References | Ying Jin, Kevin Guo and Dominik Rothenhäusler (2023). Diagnosing the role of observable distribution shift in scientific replications https://arxiv.org/abs/2309.01056 Dominik Rothenhäusler and Peter Bühlmann (2023). Distributionally robust and generalizable inference. Statistical Science. https://arxiv.org/abs/2209.09352 Lundborg, A. R., Kim, I., Shah, R. D. and Samworth, R. J. (2022) The Projected Covariance Measure for assumption-lean variable significance testing. https://arxiv.org/abs/2211.02039 Shah, R. D. and Peters, J. (2020) The hardness of conditional independence and the generalised covariance measure. Annals of Statistics, 48(3), 1514-1538. https://arxiv.org/abs/1804.07203 |

## Surfaces, graphs and commutators

Project Title | Surfaces, graphs and commutators |

Keywords | Algebra, Differential Geometry and Topology |

Project Listed | 7 February 2024 |

Contact Name | Henry Wilton |

Contact Email | hjrw2@cam.ac.uk |

Company/Lab/Department | Henry Wilton (DPMMS) |

Project Duration | 8 weeks |

Project Open to | Part II Mathematical Tripos / third year undergraduates |

Background Information |
A *commutator* is an element [a,b]:=aba^{-1}b^{-1} in a group G. The notation num(g) denotes the number of ways that an element g can be written as a commutator. (The full definition involves an equivalence relation, and is not completely obvious.) The purpose of this project is to investigate the behaviour of num(g) in the case of a non-abelian free group F. More precisely, we will investigate the maximum M(F)=max(num(g)) as g ranges over all non-trivial elements of F. The first result about M(F) is a 1981 theorem of Lyndon—Wicks, who proved that M(F) is at least 2. This remains the only known lower bound. The first upper bound comes from a general theorem of Sela from around 2000, which implies that M(F) is finite. However, the proof is non-constructive, and gives no finite upper bound. Beyond this, the state of the art is an unpublished result of Louder, who showed that M(F) <= ~10,000. |

Project Description |
The goal of this project is to investigate what’s known about M(F), and to try to improve the best known bounds. Although the definition of M(F) is algebraic, it can be rephrased in terms of maps from surfaces (specifically, a torus with one boundary component) to graphs. The techniques used will probably be combinatorial and geometric, involving drawing pictures of labelled graphs. Some experience with coding may also be useful in trying to improve Louder’s upper bound. |

Work Environment | This will be a group project, so there is room for several students to work on it, but they need to be willing to work in a group. Preference will be given to students who want to be present in Cambridge for the duration of the project. |

References |
Background reading on free groups and graphs: The state of the art on this problem: |

## Specific Projects

## Distinguishing mechanisms driving accumulation of stochastic swimmers near or away from surfaces

Project Title | Distinguishing mechanisms driving accumulation of stochastic swimmers near or away from surfaces |

Keywords | Biofilm; Microswimmers; Diffusion; Fluid Dynamics; Mathematical Biology |

Project Listed | 5 January 2024 |

Contact Name | Lloyd Fung |

Contact Email | lsf27@cam.ac.uk |

Company/Lab/Department | DAMTP / Eric Lauga's group |

Project Duration | 8 weeks |

Project Open to | Part II Mathematical Tripos / third year undergraduates |

Background Information | The phenomenon of motile microorganisms in fluids, such as bacteria, congregating close to (or away from) surfaces has been widely observed in many different contexts. The surface accumulation or depletion of these motile microorganisms, or microswimmers, is often attributed to drive biofouling and biofilm formation. Biofouling is a major issue in marine engineering, as they increase drag on ships and may even damage its structure. Meanwhile, biofilm is involved in a wide variety of microbial infections in the body. Their resistance to anti-microbial drugs, antiseptics or outright physical have always been a major challenge in healthcare. |

Project Description |
The phenomenon of motile microorganisms in fluids, such as bacteria, congregating close to (or away from) surfaces has been widely observed in many different contexts. The surface accumulation or depletion of these motile microorganisms, or microswimmers, is often attributed to drive biofouling and biofilm formation. While it seems to be a generic behaviour among many species, the specific processes underlying this behaviour can vary depending on the flow condition, type of surfaces and the microorganisms' swimming mechanism, speed and size. Mechanisms that can give rise to surface accumulation include, but are not limited to, biological taxes, far-field hydrodynamic interactions with the surface [1], coupling of stochastic swimming direction and flow [2], coupling between diffusion and swimming [3], and shape of the microswimmer and the wall [4]. Despite the distinct length scales and timescales associated with each process, a comprehensive study encompassing all these mechanisms and a definitive guideline for discerning between them based on observations are still lacking. In this project, student will be given an existing code (in JULIA) that simulate microswimmers in a long, narrow channel with a Poiseuille background flow. The first part of the project is to understand the code and replicate the result of [2] and [3]. We shall then study how varying parameters such as channel width, microswimmers' size, shape and stochasticity can give rise to or suppress different accumulation mechanism. Lastly, the student is expected to write up a short report on how to distinguish different mechanisms driving accumulation based on given parameters. This project is designed for a student with a strong background or interest in coding, numerics and stochastic processes. While prerequisite to JULIA is optional, the student embarking on this project is expected to learn the language and the Git workflow on the fly. Students who have pre-existing knowledge on Stochastic Ordinary Differential Equations (SDEs), Brownian motion and diffusion, and fluid dynamics are preferred. |

Work Environment | Student expected to work closely with the supervisor. Remote work is possible, but weekly meetings with the supervisor is expected to be physical. |

References |
1. Lauga, E. The Fluid Dynamics of Cell Motility |

Prerequisite Skills | Probability/Markov Chains, Numerical Analysis |

Other Skills Used in the Project | Fluids, Simulation, Data Visualization |

Acceptable Programming Languages | JULIA |

## The evolution of bacterial chemotaxis towards antibiotics

Project Title | The evolution of bacterial chemotaxis towards antibiotics |

Keywords | antibiotic resistance, bacterial motility, chemotaxis, microbial warfare |

Project Listed | 5 January 2024 |

Contact Name | Nuno Oliveira |

Contact Email | nmdso2@cam.ac.uk |

Company/Lab/Department | DAMTP |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information | In natural and clinical environments, bacteria are often exposed to gradients of antibiotics and other antimicrobials, but we still know little about how bacterial cells respond to such gradients. Until recently, we did not even know how bacteria control their motility in gradients of antibiotics, and cell motility is a major feature for bacterial survival and reproduction. We recently discovered that, unexpectedly, bacteria actively move towards lethal concentrations of antibiotics from different classes and modes of action, which suggested a general response (1). Moreover, we further found that the motile response we uncovered was consistent with positive chemotaxis towards the antibiotics themselves (as opposed to other emergent chemicals), and that a subpopulation of migrating bacteria produces their own antibiotics as they move along the gradient (1). These observations made us argue that bacterial chemotaxis towards antibiotics evolved as a response to antibiotic-producing species, where bacteria counterattack with their own toxins when they sense an incoming threat, akin to what we find in many animal societies. However, we still do not know how such perplexing behaviour contributes to the reproductive success of bacteria. Moreover, we have recently found by means of computer simulations and analytical theory that positive chemotaxis towards antibiotics is expected to reduce the adaptation rate of bacteria to antibiotics (2), which makes it even harder to understand how bacterial chemotaxis towards antibiotics evolved. To address this issue, we need to quantify the benefits and costs of such behaviour in direct competition with antibiotic-producing species, and this research project aims to provide such understanding. |

Project Description | The student is expected to build upon our current mathematical model of bacterial evolution in antibiotic gradients (2), and upon source-sink theory more generally (3,4). More precisely, the student will study if, and how, moving up an antibiotic landscape can be advantageous when this landscape is dynamic and results from antibiotic-producing competitors that grow in the neighbourhood of the motile genotype. Evolutionary game theory will be used to understand what the optimal motile strategy is to outcompete antibiotic-producing species. |

Work Environment | Students can work remotely or at DAMTP. Students set their own time-table and will be able to interact regularly with the supervisor responsible for the project and with students that worked on related projects before. |

References | (1) Oliveira NM, Wheeler JHR, Deroy C, Booth S, Walsh E, Durham WM, Foster KR (2023) Suicidal chemotaxis in bacteria. Nature Communications 13:7608. (2) Piskovsky V, Oliveira NM (2023) Bacterial motility can govern the dynamics of antibiotic resistance evolution. Nature Communications 14:5584. (3) Hermsen R, Hwa T. (2010) Sources and sinks: a stochastic model of evolution in heterogenous environments. Phys Rev Lett 105:248104. (4) Hermsen R, Deris J, Hwa T. (2012) On the rapidity of antibiotic resistance evolution facilitative by a concentration gradient. Proc Natl Acad Sci USA 109:10775-89. |

Prerequisite Skills | Probability/Markov Chains, PDEs, Simulation |

Other Skills Used in the Project | Game Theory |

Acceptable Programming Languages | Python, MATLAB, R |

## Behrend function for higher dimensional schemes via elementary mathematics

Project Title | Behrend function for higher dimensional schemes via elementary mathematics |

Keywords | Behrend function, enumerative geometry, Euclidean geometry of parallelepipeds, monomial ideals, combinatorics |

Project Listed | 5 January 2024 |

Contact Name | Fatemeh Rezaee |

Contact Email | fr414@cam.ac.uk |

Company/Lab/Department | DPMMS |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information | Behrend function (introduced by Kai Behrend) plays a crucial role in enumera tive geometry; in particular, Behrend function of zero-dimensional subschemes is crucial in counting invariants in enumerative theories with symmetric perfect obstructions theories. However, this function is infamous for having unpredictable behaviour; hence it has been very hard to compute it, even in low dimensions. I provided a novel, practical and elementary way to compute the Behrend function of zero-dimensional ideals in any number of variables; the story becomes more interesting when it comes to monomial ideals, which comes with a beautiful intuition. Reversely, we apply this intuition to present some interesting combinatorial results. |

Project Description | The aim is to generalise the result for the zero-dimensional case as above to higher dimensional schemes given by a monomial ideal via purely combinatorial and novel elementary methods. I’m looking for a strong and enthusiastic student to work this out together. This will give a powerful and novel tool to compute such a subtle function, which has application in enumerative geometry for specific invariants (which in turn have applications in string theory and physics) that cannot be calculated with any other tools. |

Work Environment | Flexible |

References |
Papers below are not required as we use elementary methods (books will be useful though), but I just provide some references just in case the student wants to have a look at some texts in the literature to motivate what a subtle object we propose to compute via elementary tools (4-7). The first three are books for commutative algebra and toric geometry background. Others are more advanced papers (the last two are more relevant). 1. M.F. Atiyah, .G.MacDonald,Introduction to commutative algebra,Addison-Wesley Publishing Company, 1969. |

Prerequisite Skills | Geometry/Topology, Algebra/Number Theory |

Other Skills Used in the Project | Combinatorics, commutative algebra, toric geometry |

Acceptable Programming Languages | None Required |

## Pushing dynamics of developing biofilms

Project Title | Pushing dynamics of developing biofilms |

Keywords | bacteria, biofilm, growth confinment, governing dynamics, pushing |

Project Listed | 8 January 2024 |

Contact Name | Nuno Miguel Oliveira |

Contact Email | nmdso2@cam.ac.uk |

Company/Lab/Department | DAMTP |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates, Part II Mathematical Tripos / third year undergraduates |

Background Information |
Bacteria typically form surface-associated communities (biofilms) that are critical for microbial biology and their impact on us, including the ability to promote health or disease in animals and plants. While biofilm research has been ever increasing, most studies on biofilm growth have focused on unconfined conditions, particularly on the expansion of bacterial colonies above solid or semi-solid surfaces (e.g., 1, 2). In natural and clinical conditions, however, bacteria are often exposed to spatial confinement such as the vascular systems of their hosts, and we still know little about the governing dynamics of biofilm growth under confinement, namely the interplay between radial expansion and vertical pushing. We have recently developed an experimental and theoretical framework to study biofilm growth under elastic confinement, which considers one of the simplest confined geometries, a bacterial biofilm growing in the space between a solid surface and an overlying elastic sheet (3). In particular, we considered a linear poroelastic model that led to a governing equation for the radius of the biofilm, whose dynamics matches our experimental data of biofilms growing monotonically outwards. More recently, we developed an experimental and image analysis pipeline that allowed us to correlate the radial (lateral) expansion with the pushing (vertical) dynamics of confined biofilms, and found that these two components are intrinsically linked, one determining the other (4), which further validates our analytical model that had predicted this dependence. This research project aims to expand on this work to further our understanding of the interplay between radial expansion and vertical pushing of biofilm growth under confined conditions where growth, namely under conditions where growth is not monotonic as we have considered so far but oscillatory (5). |

Project Description | Important tasks of the project include, setting up image processing scripts in MATLAB to extract quantitative data from experimental videos by building on already written code and algorithms; and develop computational and analytical theory that can capture the dynamics observed experimentally and expand on our current analytical model. The project is largely open-ended and later parts depend on what is found initially and on the specific interests of the student. |

Work Environment | The student will work with two supervisors (Nuno M. Oliveira and George Fortune), and will have the opportunity to interact with the student who developed our current image analysis pipeline to understand biofilm pushing dynamics (Twana Cheragwandi). The student will set their own schedule, and can work in DAMTP or remotely, as they prefer. |

References | (1) Semina A, et al. (2012) Osmotic Spreading of Bacillus subtilis biofilms driven by an extracellular matrix. PNAS 109 (4):1116-21. (2) Yan J, et al. (2017) Extracellular-matrix-mediated osmotic pressure drives Vibrio cholerae biofilm expansion and cheater exclusion. Nature Communications 8:327. (3) Fortune G, Oliveira NM, Goldstein RE (2022) Biofilm growth under elastic confinement. Phys Rev Letters 128:178102. (4) Cheragwandi T, Fortune G, Oliveira NM. Pushing Matters: Growth Dynamics of Confined Biofilms. In prep. (5) Liu J, et al. (2015) Metabolic co-dependence gives rise to collective oscillations within biofilms. Nature 523:55-54. |

Prerequisite Skills | Image processing, basic programming in Matlab |

Other Skills Used in the Project | Fluids |

Acceptable Programming Languages | MATLAB |

## Spectral methods for time-dependent PDEs

Project Title | Spectral methods for time-dependent PDEs |

Keywords | Spectral methods, Orthogonal systems, Orthogonal polynomials |

Project Listed | 9 January 2024 |

Contact Name | Arieh Iserles |

Contact Email | ai10@cam.ac.uk |

Company/Lab/Department | DAMTP |

Project Duration | 6 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information | Covers very recent developments on the numerical analysis of time-dependent PDEs |

Project Description |
While spectral methods for time-dependent PDEs have been around for a long time, only recently we are gaining broader understanding on their design and analysis. This will be the subject matter of this project. A spectral method starts by specifying an orthonormal basis of the underlying Hilbert space – once this is done, we can convert a time-dependent PDE into an infinite-dimensional ODE system by Galerkin conditions and solve its finite-dimensional truncation. Thus, everything starts with an orthonormal basis, which need be selected while balancing a number of criteria: The project will survey a number of contemporary approaches in the choice of orthonormal bases, in particular T-systems and W-systems. Otherwise it is fairly open ended and might pursue a number of possible directions, depending on the interests of the student. It might be theoretical or computational in flavour and might explore some open problems, in particular applications to the computation of dispersive equations of quantum mechanics. |

Work Environment | On their own + my mentoring |

References | [1] Jan Hesthaven, David Gottlieb & Sigal Gottlieb, Spectral Methods for Time-Dependent Problems, Cambridge University Press (2007). [2] Arieh Iserles & Marcus Webb, "Orthogonal systems with a skew-symmetric differentiation matrix, Found. Comp. Maths 19 (2019), 1191–1221. [3] Arieh Iserles & Marcus Webb, "Fast computation of orthogonal systems with a skew-symmetric differentiation matrix", Comm. Pure & Appld Maths 74 (2021), 478–506. [4] Arieh Iserles, "Stable spectral methods for time-dependent problems and the preservation of structure" Found. Comp. Maths (2024), to appear (copy available from Prof. Iserles). |

Prerequisite Skills | Numerical Analysis, Mathematical Analysis |

Other Skills Used in the Project | Mathematical physics |

Acceptable Programming Languages | No Preference |

## What is the statistical price for computational speed?

Project Title | What is the statistical price for computational speed? |

Keywords | Statistical-computational trade-off, minimax lower bounds, low-degree polynomials |

Project Listed | 10 January 2024 |

Contact Name | Richard Samworth |

Contact Email | rjs57@cam.ac.uk |

Company/Lab/Department | DPMMS / Stats Lab |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information | Minimax lower bounds provide fundamental limits on how well a signal can be estimated from noisy observations. When this bound is matched by a specific estimation procedure, we have a quantification of the difficulty of the particular problem at hand. Unfortunately, when the signal to be estimated is high-dimensional and may be sparse --- as is common in modern settings --- estimation procedures that match the minimax lower bound may require untenable amounts of computation. Over the last decade, a body of evidence has been collected to indicate that several estimation problems seem to exhibit a fascinating statistical-computational gap in which there is a distinction in the minimax rate of estimation and the computationally-feasible rate of estimation. |

Project Description | This project aims to develop tools to obtain computationally-constrained minimax lower bounds. In particular, we aim to develop analogues of classical methods to obtain minimax lower bounds, such as Fano's or Assouad's method, when restricted to estimators that are low degree polynomials of the observed data. |

Work Environment | The project will be jointly supervised with my post-doc Kabir Verchand and PhD student Manuel Mueller. The student will be part of my research group, which meets on Monday morning in person, and as part of an extended group meeting on Tuesday afternoons. They may wish to participate in the Statistics Clinic (which happens around once a month out of term). The four of us would meet weekly to discuss the project. Ideally the student would work in the CMS. The project is more suitable for a finishing third-year undergraduate, but may be accessible to an exceptional finishing second-year. |

References | Samworth and Shah (2024+) Modern Statistical Methods and Theory, Chapter 9 (available from the setter) Wainwright (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint, Chapter 15. Cambridge University Press Schramm and Wein (2022) Computational Barriers to Estimation from Low-Degree Polynomials, Annals of Statistics, 50, 1833-1858. Auddy and Yuan (2023) Large Dimensional Independent Component Analysis: Statistical Optimality and Computational Tractability. arXiv preprint arXiv:2303.18156. Luo and Gao (2023) Computational lower bounds for graphon estimation via low-degree polynomials. arXiv preprint arXiv:2308.15728, 2023. |

Prerequisite Skills | Statistics, Mathematical Analysis |

Other Skills Used in the Project | |

Acceptable Programming Languages | Python, R |

## Missing data in high dimensions

Project Title | Missing data in high dimensions |

Keywords | Missing data, multiple imputation |

Project Listed | 10 January 2024 |

Contact Name | Richard Samworth |

Contact Email | rjs57@cam.ac.uk |

Company/Lab/Department | Richard Samworth (DPMMS / Stats Lab) |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information | When faced with missing data, a popular strategy consists of first filling-in, or imputing, the missing entries, and subsequently using a method designed for complete data. Multiple imputation, broadly construed, consists of repeating the aforementioned procedure several times and aggregating the results. In the classical regime in which the dimension of the data is fixed and the number of observations is taken to infinity, the behaviour of several multiple imputation procedures is well understood. Unfortunately, these typically require an initial, consistent estimator, such as a complete-case estimate, which may be unavailable when the dimension of the data is large as compared to the number of observations. |

Project Description | This project will study multiple imputation in the high-dimensional setting. In particular, we adopt a simplistic assumption on the distribution of the covariates and the mechanism by which data is missing, and aim to provide sharp guarantees for the estimation error of multiple imputation. To provide these sharp estimates, we will employ tools such as Gordon's convex Gaussian minimax theorem (CGMT), a leave one out method due to El Karoui, and universality properties of high-dimensional M-estimation procedures. |

Work Environment | The project will be jointly supervised with my post-doc Kabir Verchand. The student will be part of my research group, which meets on Monday morning in person, and as part of an extended group meeting on Tuesday afternoons. They may wish to participate in the Statistics Clinic (which happens around once a month out of term). The three of us would meet weekly to discuss the project. Ideally the student would work in the CMS. The project is more suitable for a finishing third-year undergraduate, but may be accessible to an exceptional finishing second-year. |

References | van der Vaart (1998) Asymptotic Statistics, Chapter 25. Cambridge University Press. Thrampoulidis, Oymak, Hassibi (2015) Regularized linear regression: A precise analysis of the estimation error. Conference on Learning Theory. El Karoui, Bean, Bickel, Lim and Yu (2013) On robust regression with high-dimensional predictors. PNAS, 110.36, 14557-14562. Wang and Robins (1998) Large-sample theory for parametric multiple imputation procedures. Biometrika, 85, 935-948. Han and Shen (2023) Universality of regularized regression estimators in high dimensions. Annals of Statistics, 51, 1799-1823. |

Prerequisite Skills | Statistics |

Other Skills Used in the Project | Simulation |

Acceptable Programming Languages | Python, R |

## PDEs Through Deep Learning

Project Title | PDEs Through Deep Learning |

Keywords | PDEs, Deep Learning, Neural Operators, Physics-Informed Networks |

Project Listed | 12 January 2024 |

Contact Name | Angelica Aviles-Rivero |

Contact Email | ai323@cam.ac.uk |

Company/Lab/Department | DAMTP / Cambridge Image Analysis |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information |
The realm of numerical partial differential equation (PDE) solutions remains a perpetual focal point in the domain of numerical analysis, brimming with mathematical intricacies. This is most pronounced when tackling nonlinear PDEs, PDEs residing in high-dimensional spaces, and PDEs defined over intricate domains - scenarios frequently encountered in real-world problem spanning from turbulent flows and mechanical design to modelling the stock market. The crux of the matter invariably revolves around achieving a fine-grained discretisation capable of capturing the underlying phenomena or objects accurately. Conventional solvers, typified by finite element methods, perpetually pose a conundrum, demanding a delicate balance between computational load and precision. In recent years, an avant-garde solution to this age-old problem has emerged in the form of machine learning (ML), with deep neural networks. Herein, the solution to a PDE is approximated by a deep neural network, its parameters fine-tuned through training to minimise a devised loss function rooted in the essence of the PDE. This approach has, in the last few years, ignited a surge of interest within the mathematical community. It has not only outperformed classical techniques like finite element methods for certain categories of PDEs, notably high-dimensional ones, but has also excelled in specific numerical regimes, especially when swiftness in approximating a solution is imperative. One pioneering concept that has emerged along these lines is the notion of physics-informed neural networks (PINNs) [1,2]. However, PINNs grapple with the computational overhead associated with their training phase, entailing the resolution of a per-instance optimisation challenge. Conversely, another cluster of techniques has steered away from the intricacies of mesh-bound approaches, as exemplified in [3,4]. Recent strides [5,6] have approached the problem by casting it as a learning targeting infinite-dimensional operators - such as solution operators intrinsic to PDEs. |

Project Description | We propose two questions for investigation in this essay. Firstly, we hope that students will explore one of the aforementioned family of techniques, and in particular discuss their mathematical underpinning. Secondly, we seek to investigate better ways to use deep networks for solving PDEs that could alleviate drawbacks of current techniques. We also hope that students will discuss some open questions that they find interesting. |

Work Environment | The student will be fully integrated in the Cambridge Image Analysis group. The student be attending weekly seminars with the group, also discussing closely with PhD students. |

References | [1] Raissi, M., Perdikaris, P., & Karniadakis, G. E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics, 2019. [2] Pan, S., & Duraisamy, K. Physics-informed probabilistic learning of linear embeddings of nonlinear dynamics with guaranteed stability. SIAM Journal on Applied Dynamical Systems, 2020. [3] Lu, L., Jin, P., & Karniadakis, G. E. Deeponet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv:1910.03193, 2019. [4] Patel, R. G., Trask, N. A., Wood, M. A., & Cyr, E. C. A physics-informed operator regression framework for extracting data-driven continuum models. Computer Methods in Applied Mechanics and Engineering, 2021. [5] Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., & Anandkumar, A. Fourier neural operator for parametric partial differential equations. arXiv:2010.08895. [6] Kovachki, N., Li, Z., Liu, B., Azizzadenesheli, K., Bhattacharya, K., Stuart, A., & Anandkumar, A. (2021). Neural operator: Learning maps between function spaces. arXiv:2108.08481. |

Prerequisite Skills | Numerical Analysis, PDEs, Mathematical Analysis |

Other Skills Used in the Project | |

Acceptable Programming Languages | Python |

## Asymptotic Decomposition of a Charged Scalar Field in de Sitter Space

Project Title | Asymptotic Decomposition of a Charged Scalar Field in de Sitter Space |

Keywords | general relativity, asymptotic expansions, conformal method, nonlinear PDE |

Project Listed | 18 January 2024 |

Contact Name | Grigalius Taujanskas |

Contact Email | gt306@cam.ac.uk |

Company/Lab/Department | DPMMS |

Project Duration | 8 weeks |

Project Open to | Part II Mathematical Tripos / third year undergraduates |

Background Information | Astronomical observations indicate that the cosmological constant in our universe is positive. The maximally symmetric solution to Einstein's equations with a positive cosmological constant is the de Sitter spacetime, and it is of interest to study the precise asymptotic structure of de Sitter-like solutions to Einstein's (and Einstein-matter) equations. In general this is a complex question, but one particularly amenable nonlinear model is that of a massless charged scalar field on de Sitter space. These equations are conformally invariant, which permits the use of the conformal method to relate the asymptotic behaviour of solutions to properties of initial data. |

Project Description | The project will begin by reviewing the Maxwell-scalar field system and the conformal method, and in particular the results obtained in [1]. The goal of the project will then be to investigate and prove the existence of a conjectured asymptotic expansion for the charged scalar field on de Sitter space. |

Work Environment | by themselves with my guidance; they can work from anywhere |

References | [1] G. Taujanskas, Conformal scattering of the Maxwell-scalar field system on de Sitter space, J. Hyperbolic Differ. Equ. 16 (04), 743--791 (2019) [2] R. Penrose and W. Rindler, Spinors and space-time vols 1 & 2, CUP [3] J.-P. Nicolas, The conformal approach to asymptotic analysis, arXiv:1508.02592 |

Prerequisite Skills | PDEs, general relativity |

Other Skills Used in the Project | PDEs, Mathematical Analysis, Geometry/Topology |

## Implementation of Multi-Scale Dense Networks for Computed Tomography Reconstruction

Project Title | Implementation of Multi-Scale Dense Networks for Computed Tomography Reconstruction |

Keywords | Machine Learning, Tomographic Reconstruction, Inverse problems, Applications, Programming |

Project Listed | 7 February 2024 |

Contact Name | Ander Biguri |

Contact Email | ab2860@cam.ac.uk |

Company/Lab/Department | DAMTP / Cambridge Image Analysis |

Project Duration | 8 weeks |

Project Open to | Part IB Mathematical Tripos / second year undergraduates Part II Mathematical Tripos / third year undergraduates |

Background Information |
The rise of Machine Learning (ML) means that many fields, including mathematics, have been exploring uses for solving complex non-convex optimization problems. In particular, this project touches on the use of ML for Computed Tomography (CT) reconstruction, an inverse problem with a very common application in medicine. The use of ML for the task of CT reconstruction is quite new, and therefore many methods exist in a seemingly chaotic state of the literature. Our research group Cambridge Image Analysis (CIA) (lead by Prof. Carola-Bibiane Schonlieb) has been an active part of the developments of these methods. Currently, a big push of the project is to provide a unified repository of methods that are comparable, with the aim of breaching the gap between Mathematics research and applications. Project named Learned Iterative Optimization Networks (LION, https://github.com/CambridgeCIA/LION). |

Project Description |
The task within the LION framework would be to study and implement a Mixed Scale-Dense (MS-D) network in the latest Python/Pytorch versions, and integrate it to LION. Then, to test its performance with supervised and unsupervised ML CT tasks, such as Noise2Inverse. The project would require to learn about ML models, in particular MS-D model, and successfully learn how these are implemented in practice. Then learning how to use medical data to train such models to produce successful reconstructions, and compare it to other existing methods. This project would teach a student everything they need to successfully train Machine Learning methods. It is a very applied project, in where most of the work will be programming and testing ML methods, rather than developing new mathematical formulations, but there is space to do that if the implementation is finalized in an early stage. |

Work Environment | Students will work directly with me, but may also interact with other people from the group. There is the option of asking for office space at CMS if requested early, but the standard would be for students to work remotely or from common areas of Uni/CMS. The hours would depend on the students availability, but I'd suggest 7h workdays during the weektime to have plenty of time to finalized the project. |

References |
https://www.pnas.org/doi/10.1073/pnas.1715832114 |

Prerequisite Skills | Image processing, Programming skills, preferably Python's numpy/pytorch libraries |

Other Skills Used in the Project | Machine Learning |

Acceptable Programming Languages | Python |

## General project outlines / possibilities

## Tangent space to the Hilbert schemes of points

Project Title | Tangent space to the Hilbert schemes of points |

Project Listed | 5 January 2024 |

Contact Name | Fatemeh Rezaee |

Contact Email | fr414@cam.ac.uk |

Company/Lab/Department | DPMMS |

Project Duration | Up to 8 weeks |

Project Description | Hilbert schemes carry a very rich geometry: there is no geometric possibility that it cannot be found generically on some component of some Hilbert scheme (Harris-Morrison). Hilbert schemes of points parametrize zero-dimensional subschemes of a given degree in a fixed ambient space. They have connections to many areas in mathematics, such as algebraic geometry, commutative algebra, combinatorics, representation theory, moduli theory of sheaves, toric geometry, convex geometry, etc. I have some conjectural statements on the tangent space to the Hilbert scheme of points having maximum dimension at a point. Coming up with an idea of proving any of the partial cases will lead to resolving a long-standing open problem on the powers of the maximal ideal. The ideas for the proofs are basically purely combinatorial and the student will not need to know much about algebraic geometry, though some familiarity will be useful. I would be happy to make the ideas more concrete together with an interested and strong student. |

References | 1) F. Rezaee, conjectural criteria for the most singular points of the Hilbert schemes of points. arXiv: 2312.04520. 2) R.RamkumarandA.Sammartano,OnthetangentspacetotheHilbertschemeofpointsinP3, Transactions of the American Mathematical Society 375 (2022), 6179–6203. 3) B. Sturmfels, Four Counterexamples in Combinatorial Algebraic Geometry, Journal of Algebra 230 (2000), 282–294. 4) J. Briancon and A. Iarrobino, DimensionofthepunctualHilbertscheme,J.Algebra55(1978),536–544. |

## Behrend function for higher dimensional monomial schemes

Project Title | Behrend function for higher dimensional monomial schemes |

Project Listed | 5 January 2024; Please contact me by 28 January if you are interested in this project |

Contact Name | Fatemeh Rezaee |

Contact Email | fr414@cam.ac.uk |

Company/Lab/Department | DPMMS |

Project Duration | Up to 8 weeks |

Project Description | Behrend function plays a critical role in enumerative geometry, for instance to compute Donaldson-Thomas invariants, etc; computing it is very subtle though. I have a beautiful combinatorial way that computes the Behrend function for zero dimensional schemes in any arbitrary dimensional ambient spaces in a very easy way. I would like to generalise the result to higher dimensional schemes given by a monomial ideal. I’m looking for a strong and enthusiastic student to work this out together. Familiarity with combinatorics or commutative algebra or algebraic geometry will be useful, but not necessary. |

References | 1) F. Rezaee, A quotient formula for the Behrend function and its combinatorial interpretation, preprint. 2) M. Graffeo, A. Ricolfi, On the Behrend function and the blowup of some fat points, Advances in Mathematics, Volume 415 (2023), 108896. |

## How will estuarine mixing and salinity evolve under a changing climate?

Project Title | How will estuarine mixing and salinity evolve under a changing climate? |

Project Listed | 9 January 2024 |

Contact Name | Adrien Lefauve |

Contact Email | aspl2@cam.ac.uk |

Company/Lab/Department | DAMTP / Environmental Fluid Dynamics |

Project Duration | Up to 8 weeks |

Project Description | This is an environmental fluid dynamics project to tackle how turbulent mixing between freshwater and saltwater in estuaries is expected to change under global climate change in the next decades. Estuaries are transition zones where fresh river water meets the ocean's saltwater, which are vital to modern society as we know it (~20 of the world's 30 largest cities are found on estuaries). The flow of water and salinity within estuaries affect our water supply, agriculture, fisheries, biodiversity and tourism to name a few. Climate change is expected to impact estuarine dynamics (e.g. through sea level rise, more frequent droughts), but the effects on the circulation and levels of mixing (ultimately setting the salinity levels) are yet unknown. This project will investigate this, possibly using global datasets, scaling laws representing key physical processes, and idealised fluid dynamical numerical models. The details remain to be decided in collaboration with the student. |

## Single-Shot Unrolling Algorithm for Inverse Problem

Project Title | Single-Shot Unrolling Algorithm for Inverse Problem |

Project Listed | 12 January 2024 |

Contact Name | Angelica Aviles-Rivero |

Contact Email | ai323@cam.ac.uk |

Company/Lab/Department | DAMTP |

Project Duration | Up to 8 weeks |

Project Description |
The emergence of deep learning has transformed the approach to optimisation problems in applied sciences and mathematics. The concept of algorithm unrolling is a focal point within the paradigm of learning to optimise (L2O) method, recognising the burgeoning interest due to their potential in crafting efficient, high-performance, and interpretable network architectures from reasonably sized training sets. It takes the form of x^{k+1} = f(x^k; \theta^k), for k = 1, 2, â€¦, K-1 [1,2], which unfolds a truncated optimisation algorithm within the architecture of a neural network. Traditional learning to optimise technique enjoys deep learning models with redundant training, while recent work[3] has introduced the single-shot concept in the field, which minimises the need for extensive training data. This concept aligns well with algorithm unrolling, as it necessitates end-to-end training. The aim of this project is to open a new research line for algorithm unrolling, drawing attention for minimal training data. The primary emphasis will involve exploring a new single-shot strategy for neural networks, followed by seamlessly integrating this innovative strategy into the algorithm unrolling framework. |

References | [1] Monga, V., Li, Y. and Eldar, Y.C., 2021. Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing. IEEE Signal Processing Magazine, 38(2), pp.18-44. [2] Chen, T., Chen, X., Chen, W., Heaton, H., Liu, J., Wang, Z. and Yin, W., 2022. Learning to optimize: A primer and a benchmark. Journal of Machine Learning Research, 23(189), pp.1-59. [3] Cheng, Y., Zhang, L., Shen, Z., Wang, S., Yu, L., Chan, R.H., Schönlieb, C.B. and Aviles-Rivero, A.I., 2023. Single-Shot Plug-and-Play Methods for Inverse Problems. arXiv preprint arXiv:2311.13682. |

## Cosmic Strings and Boson Stars

Project Title | Cosmic Strings and Boson Stars |

Project Listed | 12 January 2024 |

Contact Name | Amelia Drew |

Contact Email | ad652@cam.ac.uk |

Company/Lab/Department | DAMTP / Theoretical Cosmology |

Project Duration | Up to 8 weeks |

Project Description | Cosmic strings are a class of topological defect that are predicted by many beyond-the-Standard-Model theories in high energy physics. They are found from cylindrically symmetric solutions to the wave equation for a complex scalar field with a 'wine bottle' potential, and are formed due to a U(1) symmetry breaking. They are a potential source of axion dark matter, as well as gravitational waves. Boson stars are spherically symmetric solitonic solutions of the Einstein-Klein-Gordon equations with a similar potential. This project will investigate the mathematical links between these two phenomena to determine whether the literature on the two subjects regarding eg. their gravitational wave signatures or massive/massless radiation can be related, with the potential to find new applications. |

References | A. Vilenkin and E. P. S. Shellard, Cosmic Strings and Other Topological Defects (Cambridge University Press, Cambridge, England, 1994) A. Drew and E. P. S. Shellard, Radiation from Global Topological Strings using Adaptive Mesh Refinement: Methodology and Massless Modes, Phys. Rev. D 105, 063517 (2019) A. Drew and E. P. S. Shellard, Radiation from Global Topological Strings using Adaptive Mesh Refinement: Massive Modes, Phys. Rev. D 107, 043507 (2023). A. Drew, T. Kinowski and E. P. S. Shellard, Axion String Source Modelling, arXiv:2312.07701 (2023) D. J. Kaup, Phys. Rev. 172, 1331 (1968) (First derivation of boson stars) T. Helfer et al., Malaise and remedy of binary boson-star initial data, Class. Quant. Grav 39, 074001 (2022) (An example of recent boson star work) |

## Algebraic geometry, combinatorics, and machine learning

Project Title | Algebraic geometry, combinatorics, and machine learning |

Project Listed | 22 January 2024; Please make contact before 12 February if you are interested |

Contact Name | Fatemeh Rezaee |

Contact Email | fr414@cam.ac.uk |

Company/Lab/Department | DPMMS |

Project Duration | Up to 8 weeks |

Project Description | There are many situations in algebraic and enumerative geometry where one can partially find some closed formulae that calculate a specific number associated with algebro-geometric objects, using combinatorial methods. In other words, one can find some partial patterns, but to see a general pattern in the partially recognized patterns, we propose applying machine learning to specific and explicit examples. For this project, I am looking for a very strong student familiar with algebraic geometry and combinatorics, and some experience with machine learning. |

## Deep neural networks in physics-based inverse imaging problems

Project Title | Deep neural networks in physics-based inverse imaging problems |

Project Listed | 23 January 2024 |

Contact Name | Zak Shumaylov (PhD student), Carola-Bibiane Schönlieb |

Contact Email | zs334@cam.ac.uk |

Company/Lab/Department | DAMTP / Cambridge Image Analysis |

Project Duration | Up to 8 weeks |

Project Description |
Summer projects: Inverse problems are concerned with reconstructing unknown model parameters from indirect and noisy measurements and arise in practically all imaging applications. In the last decade, thanks to deep learning, processing of imaging data has undergone a paradigm shift from knowledge driven approaches, deriving imaging models from first principles, to purely data driven approaches, instead deriving models from data. Most inverse problems of interest are ill-posed and require appropriate mathematical treatment for recovering meaningful solutions and while purely data-driven learning have been able to achieve remarkable empirical success for image reconstruction, they often lack rigorous reconstruction guarantees. Of particular interest are therefore methods that operate at the interface of these paradigms and feature both a knowledge driven (mathematical modelling) and a data driven (machine learning) component. The precise details of the project will be tailored to the student's interest and skill set, but possible topics/projects for investigation include: * Acquisition (operator)-aware learned reconstruction [1,2] |

References | [1] Xia, Wenjun et al. “CT Reconstruction With PDF: Parameter-Dependent Framework for Data From Multiple Geometries and Dose Levels.” IEEE transactions on medical imaging vol. 40,11 (2021): 3065-3076. doi:10.1109/TMI.2021.3085839 [2] Adler, Jonas, and Ozan Öktem. "Solving ill-posed inverse problems using iterative deep neural networks." Inverse Problems 33.12 (2017): 124007. [3] M. Raissi et al. "Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations." Journal of Computational Physics, https://doi.org/10.1016/j.jcp.2018.10.045. [4] Mukherjee, S., Dittmer, S., Shumaylov, Z., Lunz, S., Öktem, O., & Schönlieb, C. B. (2020). "Learned convex regularizers for inverse problems." [5] Berk, Aaron, et al. "Deep Proximal Gradient Method for Learned Convex Regularizers." ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023 [6] Mukherjee, S., Dittmer, S., Shumaylov, Z., Lunz, S., Öktem, O., & Schönlieb, C. B. (2020). "Learned convex regularizers for inverse problems." [7] Shumaylov, Zakhar, et al. "Provably Convergent Data-Driven Convex-Nonconvex Regularization." arXiv:2310.05812 (2023). |

Prerequisite Skills | All of the projects proposed can be pursued from both programming and theory direction, however a level of programming (preferably Python) experience is required. Other than this, interest is the only thing needed! |

## Efficient Probabilistic Machine Learning using Newton's Identities

Project Title | Efficient Probabilistic Machine Learning using Newton's Identities |

Project Listed | 26 January 2024 |

Contact Name | Henry Moss |

Contact Email | hm493@cam.ac.uk |

Company/Lab/Department | DAMTP / Institute of Computing for Climate Science |

Project Duration | Up to 8 weeks |

Project Description |
Gaussian processes are powerful probabilistic machine learning models, with particular prominence among many scientific disciplines, like in cosmology, molecular search, and climate forecasting. One reason for their prominence is the possibility they provide of directly encoding prior scientific knowledge into otherwise purely data-driven machine learning tasks. In particular, The choice of the Gaussian processes’ kernel function controls the covariance of the learned processes, with specific choices of kernel corresponding to different physical prior knowledge, e.g. conservation laws, symmetries, smoothness, or the satisfaction of particular differential equations. However, when constructing predictive models for systems or quantities with high-dimensional inputs, particularly in scenarios with limited data, the necessity arises to incorporate additional structural assumptions. This project aims to explore methods for efficiently encoding such structures by leveraging well-established mathematical expressions, reminiscent of the renowned Girard-Newton identity. The selected student will become an expert in Bayesian ML ( e.g. Gaussian processes), while concurrently scouring the mathematical literature to identify relevant extensions to the Girard-Newton identity. A key part of the project will be in building open-source tooling to enable the use of the project’s outcomes by the wider climate community. |

## Majority Dynamics in Graphs

Project Title | Majority Dynamics in Graphs |

Project Listed | 29 January 2024 |

Contact Name | Marcelo Campos |

Contact Email | mc2482@cam.ac.uk |

Company/Lab/Department | DPMMS / Combinatorics |

Project Duration | Up to 8 weeks |

Project Description | Given any graph $G$ with colored vertices one can define the majority dynamics process as the following. At each round simultaneously every vertex will turn the color that appears the most in it's neighborhood, breaking ties at random. This process has been studied for random graphs and for initial coloring being random, but many open questions remain. We would be interested in knowing for a given instance of the process, how long does it run until it reaches an equilibrium? Does one color become dominant? |

References | Majority Dynamics and Aggregation of Information in Social Networks - Elchanan Mossel, Joe Neeman, Omer Tamuz Convergence, unanimity and disagreement in majority dynamics on unimodular graphs and random graphs - Itai Benjamini, Siu-On Chan, Ryan O'Donnell, Omer Tamuz, Li-Yang Tan |

## Upper bounds on third eigenvalues of adjacency matrices

Project Title | Upper bounds on third eigenvalues of adjacency matrices |

Project Listed | 16 February 2024 |

Contact Name | Jan Petr (PhD student in DPMMS with Prof Bela Bollobas) |

Contact Email | jp895@cam.ac.uk |

Company/Lab/Department | DPMMS |

Project Duration | Up to 8 weeks |

Project Description | In 1989 a flawed proof of the following was published: the $i$-th eigenvalue of the adjacency matrix of a graph of $n$ vertices is at most $\lfloor \frac{n}{i} \rfloor$. While this statement is true for $i \in \{1,2\}$, it does not hold in general whenever $i>3$. The main task of the summer project would be to pursue a tight upper bound on the third eigenvalue in terms of $n$. |

References | Liu, Ning: Unsolved Problems in Spectral Graph Theory, arXiv 2023 Nikiforov: Extrema of graph eigenvalues, Linear Algebra and its Applications, 2015 Linz: Improved Lower Bounds on the Extrema of Eigenvalues of Graphs, Graphs and Combinatorics, 2023 |