skip to content

Summer Research Programmes

 

Group Projects *New in 2024*

 

Specific Projects

 

General project outlines / possibilities

 

Group Projects

Modelling ocean carbon solutions

Project Title Modelling ocean carbon solutions
Keywords Fluid Dynamics, Simulation, Ocean, Biology, Climate
Project Listed 22 January 2024
Contact Name John Taylor
Contact Email jrt51@cam.ac.uk
Company/Lab/Department DAMTP, Ocean Dynamics Group
Project Duration 8 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates
Part II Mathematical Tripos / third year undergraduates
Background Information There is a growing consensus that strategies will be needed to remove carbon dioxide from the atmosphere to avoid the worst consequences of climate change. Due to their very large area and capacity to sequester carbon for long timescales, the ocean has received attention as a possible venue for carbon dioxide removal (CDR). Various ocean CDR strategies have been proposed including approaches based on seawater chemistry (e.g. enhanced rock weathering) and biology (e.g. growing kelp forests). However, the ocean is a very dynamic fluid, and the interactions with the flow physics and the carbon and biological systems remain poorly understood and under-sampled.
Project Description This project would be for two students to work on a group project to help develop, test, and apply a computational tool to simulate the fluid dynamics and biological and chemical interactions at play in ocean CDR applications. This would use a code currently under development called OceanBioME (https://github.com/OceanBioME/OceanBioME.jl). OceanBioME uses the Julia proramming language and is designed to run very quickly on GPUs. Students could chose to focus more on code development, or using existing capabilities of OceanBioME appled to specific problems. Prior familiarity with the Julia programming language would be benficial, but is not required. Since this is a group project, it is essential that the student is willing to work collaboratively as part of a team.
Work Environment This is a group project, and students will work closely with PhD students and postdocs in DAMTP. The students will also have the opportunity to engage with other summer students working in the same general area through the Centre for Climate Repair. The team will be supervised by Prof. Taylor.
References https://github.com/OceanBioME/OceanBioME.jl
Prerequisite Skills Fluids, PDEs
Other Skills Used in the Project Fluids, Numerical Analysis, Simulation, Predictive Modelling, Data Visualization
Acceptable Programming Languages Julia

 

Distribution Shift in Statistics and Machine Learning

Project Title Distribution Shift in Statistics and Machine Learning
Keywords Statistics, Machine Learning
Project Listed 23 January 2024
Contact Name Rajen Shah
Contact Email rds37@cam.ac.uk
Company/Lab/Department DPMMS / Statistical Laboratory
Project Duration 8 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates
Part II Mathematical Tripos / third year undergraduates
Background Information The assumption that data are i.i.d. is ubiquitous in Statistics, both in the design of methodology and their theoretical analysis. However, if for instance, data have been collected in different environments, this assumption may not always be satisfied. Instead, the distribution of training and test data may be very different, and as a consequence, for example in industrial applications the real-world performance of machine learning methods may be noticeably worse than expected. Similarly in the sciences, the failure that many results fail to replicate in follow-up experiments conducted in a different laboratory say, is widely recognised and often attributed to distributional shifts. This is an issue of genuine practical importance, and there is now a rapidly growing body literature in both Statistics and Machine Learning attempting to develop new methods and theory to tackle these problems.
Project Description A first step towards addressing the issues is to understand what features of the distributional change are responsible for results not replicating or degradation of performance. One possible aim of this project would be to develop and evaluate a method for testing whether the conditional distribution of the response Y, given the predictors X, is constant or not across datasets. This problem is related to the important problem of testing conditional independence, and methods for the latter may be re-purposed for testing for this form of potential distributional shift. Another possible direction would be to consider cases where the distribution of the predictors X is different among the train and test data, while the regression function itself is constant. Fitting to the training data alone may then be suboptimal, and it would be desirable to tailor the performance of a fitted model to the test dataset of interest. Indeed, such a strategy often underlies the strong performance of successful Kaggle entries. To address this in a more principled way, one could look at modifying the tree-splitting criterion for a decision tree-based learner, to better account for test data, for example.
Work Environment Group project for 2-3 students working together.
References Ying Jin, Kevin Guo and Dominik Rothenhäusler (2023). Diagnosing the role of observable distribution shift in scientific replications https://arxiv.org/abs/2309.01056
Dominik Rothenhäusler and Peter Bühlmann (2023). Distributionally robust and generalizable inference. Statistical Science. https://arxiv.org/abs/2209.09352
Lundborg, A. R., Kim, I., Shah, R. D. and Samworth, R. J. (2022) The Projected Covariance Measure for assumption-lean variable significance testing. https://arxiv.org/abs/2211.02039
Shah, R. D. and Peters, J. (2020) The hardness of conditional independence and the generalised covariance measure. Annals of Statistics, 48(3), 1514-1538. https://arxiv.org/abs/1804.07203

 

Surfaces, graphs and commutators

Project Title Surfaces, graphs and commutators
Keywords Algebra, Differential Geometry and Topology
Project Listed 7 February 2024
Contact Name Henry Wilton
Contact Email hjrw2@cam.ac.uk
Company/Lab/Department Henry Wilton (DPMMS)
Project Duration 8 weeks
Project Open to Part II Mathematical Tripos / third year undergraduates
Background Information

A *commutator* is an element [a,b]:=aba^{-1}b^{-1} in a group G. The notation num(g) denotes the number of ways that an element g can be written as a commutator. (The full definition involves an equivalence relation, and is not completely obvious.) The purpose of this project is to investigate the behaviour of num(g) in the case of a non-abelian free group F. More precisely, we will investigate the maximum M(F)=max(num(g)) as g ranges over all non-trivial elements of F.

The first result about M(F) is a 1981 theorem of Lyndon—Wicks, who proved that M(F) is at least 2. This remains the only known lower bound.

The first upper bound comes from a general theorem of Sela from around 2000, which implies that M(F) is finite. However, the proof is non-constructive, and gives no finite upper bound. Beyond this, the state of the art is an unpublished result of Louder, who showed that M(F) <= ~10,000.

Project Description

The goal of this project is to investigate what’s known about M(F), and to try to improve the best known bounds. Although the definition of M(F) is algebraic, it can be rephrased in terms of maps from surfaces (specifically, a torus with one boundary component) to graphs. The techniques used will probably be combinatorial and geometric, involving drawing pictures of labelled graphs. Some experience with coding may also be useful in trying to improve Louder’s upper bound.

Work Environment This will be a group project, so there is room for several students to work on it, but they need to be willing to work in a group. Preference will be given to students who want to be present in Cambridge for the duration of the project.
References

Background reading on free groups and graphs:
Stallings, J.R. Topology of finite graphs. Invent Math 71, 551–565 (1983).

The state of the art on this problem:
Bestvina and Feighn, Counting maps from a surface to a graph, https://arxiv.org/abs/math/0505363v1

 

Specific Projects

Distinguishing mechanisms driving accumulation of stochastic swimmers near or away from surfaces

Project Title Distinguishing mechanisms driving accumulation of stochastic swimmers near or away from surfaces
Keywords Biofilm; Microswimmers; Diffusion; Fluid Dynamics; Mathematical Biology
Project Listed 5 January 2024
Contact Name Lloyd Fung
Contact Email lsf27@cam.ac.uk
Company/Lab/Department DAMTP / Eric Lauga's group
Project Duration 8 weeks
Project Open to Part II Mathematical Tripos / third year undergraduates
Background Information The phenomenon of motile microorganisms in fluids, such as bacteria, congregating close to (or away from) surfaces has been widely observed in many different contexts. The surface accumulation or depletion of these motile microorganisms, or microswimmers, is often attributed to drive biofouling and biofilm formation. Biofouling is a major issue in marine engineering, as they increase drag on ships and may even damage its structure. Meanwhile, biofilm is involved in a wide variety of microbial infections in the body. Their resistance to anti-microbial drugs, antiseptics or outright physical have always been a major challenge in healthcare.
Project Description

The phenomenon of motile microorganisms in fluids, such as bacteria, congregating close to (or away from) surfaces has been widely observed in many different contexts. The surface accumulation or depletion of these motile microorganisms, or microswimmers, is often attributed to drive biofouling and biofilm formation. While it seems to be a generic behaviour among many species, the specific processes underlying this behaviour can vary depending on the flow condition, type of surfaces and the microorganisms' swimming mechanism, speed and size. Mechanisms that can give rise to surface accumulation include, but are not limited to, biological taxes, far-field hydrodynamic interactions with the surface [1], coupling of stochastic swimming direction and flow [2], coupling between diffusion and swimming [3], and shape of the microswimmer and the wall [4]. Despite the distinct length scales and timescales associated with each process, a comprehensive study encompassing all these mechanisms and a definitive guideline for discerning between them based on observations are still lacking.

In this project, student will be given an existing code (in JULIA) that simulate microswimmers in a long, narrow channel with a Poiseuille background flow. The first part of the project is to understand the code and replicate the result of [2] and [3]. We shall then study how varying parameters such as channel width, microswimmers' size, shape and stochasticity can give rise to or suppress different accumulation mechanism. Lastly, the student is expected to write up a short report on how to distinguish different mechanisms driving accumulation based on given parameters.

This project is designed for a student with a strong background or interest in coding, numerics and stochastic processes. While prerequisite to JULIA is optional, the student embarking on this project is expected to learn the language and the Git workflow on the fly. Students who have pre-existing knowledge on Stochastic Ordinary Differential Equations (SDEs), Brownian motion and diffusion, and fluid dynamics are preferred.

Work Environment Student expected to work closely with the supervisor. Remote work is possible, but weekly meetings with the supervisor is expected to be physical.
References

1. Lauga, E. The Fluid Dynamics of Cell Motility
2. Vennamneni, L., Nambiar, S. & Subramanian, G. (2020). J. Fluid Mech., 890, A15.
3. Ezhilan, B. & Saintillan, D. (2015). J. Fluid Mech., 777, pp. 482-522.
4. Chen, H. & Thiffeault, J.-L. (2021). J. Fluid Mech., 916, p. A15.

Prerequisite Skills Probability/Markov Chains, Numerical Analysis
Other Skills Used in the Project Fluids, Simulation, Data Visualization
Acceptable Programming Languages JULIA

 

Behrend function for higher dimensional schemes via elementary mathematics

Project Title Behrend function for higher dimensional schemes via elementary mathematics
Keywords Behrend function, enumerative geometry, Euclidean geometry of parallelepipeds, monomial ideals, combinatorics
Project Listed 5 January 2024
Contact Name Fatemeh Rezaee
Contact Email fr414@cam.ac.uk
Company/Lab/Department DPMMS
Project Duration 8 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates
Part II Mathematical Tripos / third year undergraduates
Background Information Behrend function (introduced by Kai Behrend) plays a crucial role in enumera tive geometry; in particular, Behrend function of zero-dimensional subschemes is crucial in counting invariants in enumerative theories with symmetric perfect obstructions theories. However, this function is infamous for having unpredictable behaviour; hence it has been very hard to compute it, even in low dimensions. I provided a novel, practical and elementary way to compute the Behrend function of zero-dimensional ideals in any number of variables; the story becomes more interesting when it comes to monomial ideals, which comes with a beautiful intuition. Reversely, we apply this intuition to present some interesting combinatorial results.
Project Description The aim is to generalise the result for the zero-dimensional case as above to higher dimensional schemes given by a monomial ideal via purely combinatorial and novel elementary methods. I’m looking for a strong and enthusiastic student to work this out together. This will give a powerful and novel tool to compute such a subtle function, which has application in enumerative geometry for specific invariants (which in turn have applications in string theory and physics) that cannot be calculated with any other tools.
Work Environment Flexible
References

Papers below are not required as we use elementary methods (books will be useful though), but I just provide some references just in case the student wants to have a look at some texts in the literature to motivate what a subtle object we propose to compute via elementary tools (4-7).

The first three are books for commutative algebra and toric geometry background. Others are more advanced papers (the last two are more relevant).

1. M.F. Atiyah, .G.MacDonald,Introduction to commutative algebra,Addison-Wesley Publishing Company, 1969.
2. David Eisenbud, Commutative algebra, Graduate Text in Mathematics,vol.150,Springer-Verlag,1995.
3. David A. Cox, John B. Little, Henry K. Schenck: “Toric Varieties”
4. Kai Behrend, Donaldson–Thomas type invariants via microlocal geometry, Ann. of Math. 2 (2009), no. 170, 1307–1338.
5. Kai Behrend and Barbara Fantechi, The intrinsic normal cone, Invent.Math.128(1997),no.1,45–88.
6. F. Rezaee, A quotient formula for the Behrend function and its combinatorial interpretation, preprint (available upon request)
7. M. Graffeo, A. Ricolfi, On the Behrend function and the blowup of some fat points, Advances in Mathematics, Volume 415 (2023), 108896.

Prerequisite Skills Geometry/Topology, Algebra/Number Theory
Other Skills Used in the Project Combinatorics, commutative algebra, toric geometry
Acceptable Programming Languages None Required

 

Spectral methods for time-dependent PDEs

Project Title Spectral methods for time-dependent PDEs
Keywords Spectral methods, Orthogonal systems, Orthogonal polynomials
Project Listed 9 January 2024
Contact Name Arieh Iserles
Contact Email ai10@cam.ac.uk
Company/Lab/Department DAMTP
Project Duration 6 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates
Part II Mathematical Tripos / third year undergraduates
Background Information Covers very recent developments on the numerical analysis of time-dependent PDEs
Project Description

While spectral methods for time-dependent PDEs have been around for a long time, only recently we are gaining broader understanding on their design and analysis. This will be the subject matter of this project.

A spectral method starts by specifying an orthonormal basis of the underlying Hilbert space – once this is done, we can convert a time-dependent PDE into an infinite-dimensional ODE system by Galerkin conditions and solve its finite-dimensional truncation. Thus, everything starts with an orthonormal basis, which need be selected while balancing a number of criteria:
• Stability and well-posedness
• Fast convergence
• Preservation of qualitative features, e.g. mass or dissipativity
• Ease of expansion
• Ease of linear algebra computations

The project will survey a number of contemporary approaches in the choice of orthonormal bases, in particular T-systems and W-systems. Otherwise it is fairly open ended and might pursue a number of possible directions, depending on the interests of the student. It might be theoretical or computational in flavour and might explore some open problems, in particular applications to the computation of dispersive equations of quantum mechanics.

Work Environment On their own + my mentoring
References [1] Jan Hesthaven, David Gottlieb & Sigal Gottlieb, Spectral Methods for Time-Dependent Problems, Cambridge University Press (2007).
[2] Arieh Iserles & Marcus Webb, "Orthogonal systems with a skew-symmetric differentiation matrix, Found. Comp. Maths 19 (2019), 1191–1221.
[3] Arieh Iserles & Marcus Webb, "Fast computation of orthogonal systems with a skew-symmetric differentiation matrix", Comm. Pure & Appld Maths 74 (2021), 478–506.
[4] Arieh Iserles, "Stable spectral methods for time-dependent problems and the preservation of structure" Found. Comp. Maths (2024), to appear (copy available from Prof. Iserles).
Prerequisite Skills Numerical Analysis, Mathematical Analysis
Other Skills Used in the Project Mathematical physics
Acceptable Programming Languages No Preference

 

What is the statistical price for computational speed?

Project Title What is the statistical price for computational speed?
Keywords Statistical-computational trade-off, minimax lower bounds, low-degree polynomials
Project Listed 10 January 2024
Contact Name Richard Samworth
Contact Email rjs57@cam.ac.uk
Company/Lab/Department DPMMS / Stats Lab
Project Duration 8 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates
Part II Mathematical Tripos / third year undergraduates
Background Information Minimax lower bounds provide fundamental limits on how well a signal can be estimated from noisy observations.  When this bound is matched by a specific estimation procedure, we have a quantification of the difficulty of the particular problem at hand.  Unfortunately, when the signal to be estimated is high-dimensional and may be sparse --- as is common in modern settings --- estimation procedures that match the minimax lower bound may require untenable amounts of computation.  Over the last decade, a body of evidence has been collected to indicate that several estimation problems seem to exhibit a fascinating statistical-computational gap in which there is a distinction in the minimax rate of estimation and the computationally-feasible rate of estimation. 
Project Description This project aims to develop tools to obtain computationally-constrained minimax lower bounds.  In particular, we aim to develop analogues of classical methods to obtain minimax lower bounds, such as Fano's or Assouad's method, when restricted to estimators that are low degree polynomials of the observed data.
Work Environment The project will be jointly supervised with my post-doc Kabir Verchand and PhD student Manuel Mueller. The student will be part of my research group, which meets on Monday morning in person, and as part of an extended group meeting on Tuesday afternoons. They may wish to participate in the Statistics Clinic (which happens around once a month out of term). The four of us would meet weekly to discuss the project. Ideally the student would work in the CMS. The project is more suitable for a finishing third-year undergraduate, but may be accessible to an exceptional finishing second-year.
References Samworth and Shah (2024+) Modern Statistical Methods and Theory, Chapter 9 (available from the setter) Wainwright (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint, Chapter 15. Cambridge University Press
Schramm and Wein (2022) Computational Barriers to Estimation from Low-Degree Polynomials, Annals of Statistics, 50, 1833-1858.
Auddy and Yuan (2023) Large Dimensional Independent Component Analysis: Statistical Optimality and Computational Tractability. arXiv preprint arXiv:2303.18156.
Luo and Gao (2023) Computational lower bounds for graphon estimation via low-degree polynomials. arXiv preprint arXiv:2308.15728, 2023.
Prerequisite Skills Statistics, Mathematical Analysis
Other Skills Used in the Project  
Acceptable Programming Languages Python, R

 

Missing data in high dimensions

Project Title Missing data in high dimensions
Keywords Missing data, multiple imputation
Project Listed 10 January 2024
Contact Name Richard Samworth
Contact Email rjs57@cam.ac.uk
Company/Lab/Department Richard Samworth (DPMMS / Stats Lab)
Project Duration 8 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates
Part II Mathematical Tripos / third year undergraduates
Background Information When faced with missing data, a popular strategy consists of first filling-in, or imputing, the missing entries, and subsequently using a method designed for complete data.  Multiple imputation, broadly construed, consists of repeating the aforementioned procedure several times and aggregating the results.  In the classical regime in which the dimension of the data is fixed and the number of observations is taken to infinity, the behaviour of several multiple imputation procedures is well understood.  Unfortunately, these typically require an initial, consistent estimator, such as a complete-case estimate, which may be unavailable when the dimension of the data is large as compared to the number of observations. 
Project Description This project will study multiple imputation in the high-dimensional setting.  In particular, we adopt a simplistic assumption on the distribution of the covariates and the mechanism by which data is missing, and aim to provide sharp guarantees for the estimation error of multiple imputation.  To provide these sharp estimates, we will employ tools such as Gordon's convex Gaussian minimax theorem (CGMT), a leave one out method due to El Karoui, and universality properties of high-dimensional M-estimation procedures.  
Work Environment The project will be jointly supervised with my post-doc Kabir Verchand. The student will be part of my research group, which meets on Monday morning in person, and as part of an extended group meeting on Tuesday afternoons. They may wish to participate in the Statistics Clinic (which happens around once a month out of term). The three of us would meet weekly to discuss the project. Ideally the student would work in the CMS. The project is more suitable for a finishing third-year undergraduate, but may be accessible to an exceptional finishing second-year.
References van der Vaart (1998) Asymptotic Statistics, Chapter 25. Cambridge University Press.
Thrampoulidis, Oymak, Hassibi (2015) Regularized linear regression: A precise analysis of the estimation error. Conference on Learning Theory.
El Karoui, Bean, Bickel, Lim and Yu (2013) On robust regression with high-dimensional predictors. PNAS, 110.36, 14557-14562.
Wang and Robins (1998) Large-sample theory for parametric multiple imputation procedures. Biometrika, 85, 935-948.
Han and Shen (2023) Universality of regularized regression estimators in high dimensions. Annals of Statistics, 51, 1799-1823.
Prerequisite Skills Statistics
Other Skills Used in the Project Simulation
Acceptable Programming Languages Python, R

 

PDEs Through Deep Learning

Project Title PDEs Through Deep Learning
Keywords PDEs, Deep Learning, Neural Operators, Physics-Informed Networks
Project Listed 12 January 2024
Contact Name Angelica Aviles-Rivero
Contact Email ai323@cam.ac.uk
Company/Lab/Department DAMTP / Cambridge Image Analysis
Project Duration 8 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates 
Part II Mathematical Tripos / third year undergraduates
Background Information

The realm of numerical partial differential equation (PDE) solutions remains a perpetual focal point in the domain of numerical analysis, brimming with mathematical intricacies. This is most pronounced when tackling nonlinear PDEs, PDEs residing in high-dimensional spaces, and PDEs defined over intricate domains - scenarios frequently encountered in real-world problem spanning from turbulent flows and mechanical design to modelling the stock market. The crux of the matter invariably revolves around achieving a fine-grained discretisation capable of capturing the underlying phenomena or objects accurately.

Conventional solvers, typified by finite element methods, perpetually pose a conundrum, demanding a delicate balance between computational load and precision. In recent years, an avant-garde solution to this age-old problem has emerged in the form of machine learning (ML), with deep neural networks. Herein, the solution to a PDE is approximated by a deep neural network, its parameters fine-tuned through training to minimise a devised loss function rooted in the essence of the PDE. This approach has, in the last few years, ignited a surge of interest within the mathematical community. It has not only outperformed classical techniques like finite element methods for certain categories of PDEs, notably high-dimensional ones, but has also excelled in specific numerical regimes, especially when swiftness in approximating a solution is imperative.

One pioneering concept that has emerged along these lines is the notion of physics-informed neural networks (PINNs) [1,2]. However, PINNs grapple with the computational overhead associated with their training phase, entailing the resolution of a per-instance optimisation challenge. Conversely, another cluster of techniques has steered away from the intricacies of mesh-bound approaches, as exemplified in [3,4]. Recent strides [5,6] have approached the problem by casting it as a learning targeting infinite-dimensional operators - such as solution operators intrinsic to PDEs.

Project Description We propose two questions for investigation in this essay. Firstly, we hope that students will explore one of the aforementioned family of techniques, and in particular discuss their mathematical underpinning. Secondly, we seek to investigate better ways to use deep networks for solving PDEs that could alleviate drawbacks of current techniques. We also hope that students will discuss some open questions that they find interesting.
Work Environment The student will be fully integrated in the Cambridge Image Analysis group. The student be attending weekly seminars with the group, also discussing closely with PhD students.
References [1] Raissi, M., Perdikaris, P., & Karniadakis, G. E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational physics, 2019.
[2] Pan, S., & Duraisamy, K. Physics-informed probabilistic learning of linear embeddings of nonlinear dynamics with guaranteed stability. SIAM Journal on Applied Dynamical Systems, 2020.
[3] Lu, L., Jin, P., & Karniadakis, G. E. Deeponet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv:1910.03193, 2019.
[4] Patel, R. G., Trask, N. A., Wood, M. A., & Cyr, E. C. A physics-informed operator regression framework for extracting data-driven continuum models. Computer Methods in Applied Mechanics and Engineering, 2021.
[5] Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., & Anandkumar, A. Fourier neural operator for parametric partial differential equations. arXiv:2010.08895.
[6] Kovachki, N., Li, Z., Liu, B., Azizzadenesheli, K., Bhattacharya, K., Stuart, A., & Anandkumar, A. (2021). Neural operator: Learning maps between function spaces. arXiv:2108.08481.
Prerequisite Skills Numerical Analysis, PDEs, Mathematical Analysis
Other Skills Used in the Project  
Acceptable Programming Languages Python

 

Asymptotic Decomposition of a Charged Scalar Field in de Sitter Space

Project Title Asymptotic Decomposition of a Charged Scalar Field in de Sitter Space
Keywords general relativity, asymptotic expansions, conformal method, nonlinear PDE
Project Listed 18 January 2024
Contact Name Grigalius Taujanskas
Contact Email gt306@cam.ac.uk
Company/Lab/Department DPMMS
Project Duration 8 weeks
Project Open to Part II Mathematical Tripos / third year undergraduates
Background Information Astronomical observations indicate that the cosmological constant in our universe is positive. The maximally symmetric solution to Einstein's equations with a positive cosmological constant is the de Sitter spacetime, and it is of interest to study the precise asymptotic structure of de Sitter-like solutions to Einstein's (and Einstein-matter) equations. In general this is a complex question, but one particularly amenable nonlinear model is that of a massless charged scalar field on de Sitter space. These equations are conformally invariant, which permits the use of the conformal method to relate the asymptotic behaviour of solutions to properties of initial data.
Project Description The project will begin by reviewing the Maxwell-scalar field system and the conformal method, and in particular the results obtained in [1]. The goal of the project will then be to investigate and prove the existence of a conjectured asymptotic expansion for the charged scalar field on de Sitter space.
Work Environment by themselves with my guidance; they can work from anywhere
References [1] G. Taujanskas, Conformal scattering of the Maxwell-scalar field system on de Sitter space, J. Hyperbolic Differ. Equ. 16 (04), 743--791 (2019)
[2] R. Penrose and W. Rindler, Spinors and space-time vols 1 & 2, CUP
[3] J.-P. Nicolas, The conformal approach to asymptotic analysis, arXiv:1508.02592
Prerequisite Skills PDEs, general relativity
Other Skills Used in the Project PDEs, Mathematical Analysis, Geometry/Topology

 

Implementation of Multi-Scale Dense Networks for Computed Tomography Reconstruction

Project Title Implementation of Multi-Scale Dense Networks for Computed Tomography Reconstruction
Keywords Machine Learning, Tomographic Reconstruction, Inverse problems, Applications, Programming
Project Listed 7 February 2024
Contact Name Ander Biguri
Contact Email ab2860@cam.ac.uk
Company/Lab/Department DAMTP / Cambridge Image Analysis
Project Duration 8 weeks
Project Open to Part IB Mathematical Tripos / second year undergraduates
Part II Mathematical Tripos / third year undergraduates
Background Information

The rise of Machine Learning (ML) means that many fields, including mathematics, have been exploring uses for solving complex non-convex optimization problems. In particular, this project touches on the use of ML for Computed Tomography (CT) reconstruction, an inverse problem with a very common application in medicine. The use of ML for the task of CT reconstruction is quite new, and therefore many methods exist in a seemingly chaotic state of the literature.

Our research group Cambridge Image Analysis (CIA) (lead by Prof. Carola-Bibiane Schonlieb) has been an active part of the developments of these methods. Currently, a big push of the project is to provide a unified repository of methods that are comparable, with the aim of breaching the gap between Mathematics research and applications. Project named Learned Iterative Optimization Networks (LION, https://github.com/CambridgeCIA/LION).

Project Description

The task within the LION framework would be to study and implement a Mixed Scale-Dense (MS-D) network in the latest Python/Pytorch versions, and integrate it to LION. Then, to test its performance with supervised and unsupervised ML CT tasks, such as Noise2Inverse.

The project would require to learn about ML models, in particular MS-D model, and successfully learn how these are implemented in practice. Then learning how to use medical data to train such models to produce successful reconstructions, and compare it to other existing methods.

This project would teach a student everything they need to successfully train Machine Learning methods. It is a very applied project, in where most of the work will be programming and testing ML methods, rather than developing new mathematical formulations, but there is space to do that if the implementation is finalized in an early stage.

Work Environment Students will work directly with me, but may also interact with other people from the group. There is the option of asking for office space at CMS if requested early, but the standard would be for students to work remotely or from common areas of Uni/CMS. The hours would depend on the students availability, but I'd suggest 7h workdays during the weektime to have plenty of time to finalized the project.
References

https://www.pnas.org/doi/10.1073/pnas.1715832114
https://arxiv.org/abs/2001.11801

Prerequisite Skills Image processing, Programming skills, preferably Python's numpy/pytorch libraries
Other Skills Used in the Project Machine Learning
Acceptable Programming Languages Python

 

General project outlines / possibilities

Tangent space to the Hilbert schemes of points

Project Title Tangent space to the Hilbert schemes of points
Project Listed 5 January 2024
Contact Name Fatemeh Rezaee
Contact Email fr414@cam.ac.uk
Company/Lab/Department DPMMS
Project Duration Up to 8 weeks
Project Description Hilbert schemes carry a very rich geometry: there is no geometric possibility that it cannot be found generically on some component of some Hilbert scheme (Harris-Morrison). Hilbert schemes of points parametrize zero-dimensional subschemes of a given degree in a fixed ambient space. They have connections to many areas in mathematics, such as algebraic geometry, commutative algebra, combinatorics, representation theory, moduli theory of sheaves, toric geometry, convex geometry, etc. I have some conjectural statements on the tangent space to the Hilbert scheme of points having maximum dimension at a point. Coming up with an idea of proving any of the partial cases will lead to resolving a long-standing open problem on the powers of the maximal ideal. The ideas for the proofs are basically purely combinatorial and the student will not need to know much about algebraic geometry, though some familiarity will be useful. I would be happy to make the ideas more concrete together with an interested and strong student.
References 1) F. Rezaee, conjectural criteria for the most singular points of the Hilbert schemes of points. arXiv: 2312.04520.
2) R.RamkumarandA.Sammartano,OnthetangentspacetotheHilbertschemeofpointsinP3, Transactions of the American Mathematical Society 375 (2022), 6179–6203.
3) B. Sturmfels, Four Counterexamples in Combinatorial Algebraic Geometry, Journal of Algebra 230 (2000), 282–294.
4) J. Briancon and A. Iarrobino, DimensionofthepunctualHilbertscheme,J.Algebra55(1978),536–544.

 

Behrend function for higher dimensional monomial schemes

Project Title Behrend function for higher dimensional monomial schemes
Project Listed 5 January 2024; Please contact me by 28 January if you are interested in this project
Contact Name Fatemeh Rezaee
Contact Email fr414@cam.ac.uk
Company/Lab/Department DPMMS
Project Duration Up to 8 weeks
Project Description Behrend function plays a critical role in enumerative geometry, for instance to compute Donaldson-Thomas invariants, etc; computing it is very subtle though. I have a beautiful combinatorial way that computes the Behrend function for zero dimensional schemes in any arbitrary dimensional ambient spaces in a very easy way. I would like to generalise the result to higher dimensional schemes given by a monomial ideal. I’m looking for a strong and enthusiastic student to work this out together. Familiarity with combinatorics or commutative algebra or algebraic geometry will be useful, but not necessary.
References 1) F. Rezaee, A quotient formula for the Behrend function and its combinatorial interpretation, preprint.
2) M. Graffeo, A. Ricolfi, On the Behrend function and the blowup of some fat points, Advances in Mathematics, Volume 415 (2023), 108896.

 

How will estuarine mixing and salinity evolve under a changing climate?

Project Title How will estuarine mixing and salinity evolve under a changing climate?
Project Listed 9 January 2024
Contact Name Adrien Lefauve
Contact Email aspl2@cam.ac.uk
Company/Lab/Department DAMTP / Environmental Fluid Dynamics
Project Duration Up to 8 weeks
Project Description This is an environmental fluid dynamics project to tackle how turbulent mixing between freshwater and saltwater in estuaries is expected to change under global climate change in the next decades. Estuaries are transition zones where fresh river water meets the ocean's saltwater, which are vital to modern society as we know it (~20 of the world's 30 largest cities are found on estuaries). The flow of water and salinity within estuaries affect our water supply, agriculture, fisheries, biodiversity and tourism to name a few. Climate change is expected to impact estuarine dynamics (e.g. through sea level rise, more frequent droughts), but the effects on the circulation and levels of mixing (ultimately setting the salinity levels) are yet unknown. This project will investigate this, possibly using global datasets, scaling laws representing key physical processes, and idealised fluid dynamical numerical models. The details remain to be decided in collaboration with the student.

 

Single-Shot Unrolling Algorithm for Inverse Problem

Project Title Single-Shot Unrolling Algorithm for Inverse Problem
Project Listed 12 January 2024
Contact Name Angelica Aviles-Rivero
Contact Email ai323@cam.ac.uk
Company/Lab/Department DAMTP
Project Duration Up to 8 weeks
Project Description

The emergence of deep learning has transformed the approach to optimisation problems in applied sciences and mathematics. The concept of algorithm unrolling is a focal point within the paradigm of learning to optimise (L2O) method, recognising the burgeoning interest due to their potential in crafting efficient, high-performance, and interpretable network architectures from reasonably sized training sets. It takes the form of x^{k+1} = f(x^k; \theta^k), for k = 1, 2, …, K-1 [1,2], which unfolds a truncated optimisation algorithm within the architecture of a neural network. Traditional learning to optimise technique enjoys deep learning models with redundant training, while recent work[3] has introduced the single-shot concept in the field, which minimises the need for extensive training data. This concept aligns well with algorithm unrolling, as it necessitates end-to-end training.

The aim of this project is to open a new research line for algorithm unrolling, drawing attention for minimal training data. The primary emphasis will involve exploring a new single-shot strategy for neural networks, followed by seamlessly integrating this innovative strategy into the algorithm unrolling framework.

References [1] Monga, V., Li, Y. and Eldar, Y.C., 2021. Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing. IEEE Signal Processing Magazine, 38(2), pp.18-44.
[2] Chen, T., Chen, X., Chen, W., Heaton, H., Liu, J., Wang, Z. and Yin, W., 2022. Learning to optimize: A primer and a benchmark. Journal of Machine Learning Research, 23(189), pp.1-59.
[3] Cheng, Y., Zhang, L., Shen, Z., Wang, S., Yu, L., Chan, R.H., Schönlieb, C.B. and Aviles-Rivero, A.I., 2023. Single-Shot Plug-and-Play Methods for Inverse Problems. arXiv preprint arXiv:2311.13682.

 

Cosmic Strings and Boson Stars

Project Title Cosmic Strings and Boson Stars
Project Listed 12 January 2024
Contact Name Amelia Drew
Contact Email ad652@cam.ac.uk
Company/Lab/Department DAMTP / Theoretical Cosmology
Project Duration Up to 8 weeks
Project Description Cosmic strings are a class of topological defect that are predicted by many beyond-the-Standard-Model theories in high energy physics. They are found from cylindrically symmetric solutions to the wave equation for a complex scalar field with a 'wine bottle' potential, and are formed due to a U(1) symmetry breaking. They are a potential source of axion dark matter, as well as gravitational waves. Boson stars are spherically symmetric solitonic solutions of the Einstein-Klein-Gordon equations with a similar potential. This project will investigate the mathematical links between these two phenomena to determine whether the literature on the two subjects regarding eg. their gravitational wave signatures or massive/massless radiation can be related, with the potential to find new applications.
References A. Vilenkin and E. P. S. Shellard, Cosmic Strings and Other Topological Defects (Cambridge University Press, Cambridge, England, 1994)
A. Drew and E. P. S. Shellard, Radiation from Global Topological Strings using Adaptive Mesh Refinement: Methodology and Massless Modes, Phys. Rev. D 105, 063517 (2019)
A. Drew and E. P. S. Shellard, Radiation from Global Topological Strings using Adaptive Mesh Refinement: Massive Modes, Phys. Rev. D 107, 043507 (2023).
A. Drew, T. Kinowski and E. P. S. Shellard, Axion String Source Modelling, arXiv:2312.07701 (2023)
D. J. Kaup, Phys. Rev. 172, 1331 (1968) (First derivation of boson stars)
T. Helfer et al., Malaise and remedy of binary boson-star initial data, Class. Quant. Grav 39, 074001 (2022) (An example of recent boson star work)

 

Algebraic geometry, combinatorics, and machine learning

Project Title Algebraic geometry, combinatorics, and machine learning
Project Listed 22 January 2024; Please make contact before 12 February if you are interested
Contact Name Fatemeh Rezaee
Contact Email fr414@cam.ac.uk
Company/Lab/Department DPMMS
Project Duration Up to 8 weeks
Project Description There are many situations in algebraic and enumerative geometry where one can partially find some closed formulae that calculate a specific number associated with algebro-geometric objects, using combinatorial methods. In other words, one can find some partial patterns, but to see a general pattern in the partially recognized patterns, we propose applying machine learning to specific and explicit examples. For this project, I am looking for a very strong student familiar with algebraic geometry and combinatorics, and some experience with machine learning.

 

Deep neural networks in physics-based inverse imaging problems

Project Title Deep neural networks in physics-based inverse imaging problems
Project Listed 23 January 2024
Contact Name Zak Shumaylov (PhD student), Carola-Bibiane Schönlieb
Contact Email zs334@cam.ac.uk
Company/Lab/Department DAMTP / Cambridge Image Analysis
Project Duration Up to 8 weeks
Project Description

Summer projects: Inverse problems are concerned with reconstructing unknown model parameters from indirect and noisy measurements and arise in practically all imaging applications. In the last decade, thanks to deep learning, processing of imaging data has undergone a paradigm shift from knowledge driven approaches, deriving imaging models from first principles, to purely data driven approaches, instead deriving models from data. Most inverse problems of interest are ill-posed and require appropriate mathematical treatment for recovering meaningful solutions and while purely data-driven learning have been able to achieve remarkable empirical success for image reconstruction, they often lack rigorous reconstruction guarantees. Of particular interest are therefore methods that operate at the interface of these paradigms and feature both a knowledge driven (mathematical modelling) and a data driven (machine learning) component.

The precise details of the project will be tailored to the student's interest and skill set, but possible topics/projects for investigation include:

* Acquisition (operator)-aware learned reconstruction [1,2]
* Combining physics informed neural networks with learned regularisation [3,4,5,]
* Structured non-convex learned regularisation [6,7]

References [1] Xia, Wenjun et al. “CT Reconstruction With PDF: Parameter-Dependent Framework for Data From Multiple Geometries and Dose Levels.” IEEE transactions on medical imaging vol. 40,11 (2021): 3065-3076. doi:10.1109/TMI.2021.3085839
[2] Adler, Jonas, and Ozan Öktem. "Solving ill-posed inverse problems using iterative deep neural networks." Inverse Problems 33.12 (2017): 124007.
[3] M. Raissi et al. "Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations." Journal of Computational Physics, https://doi.org/10.1016/j.jcp.2018.10.045.
[4] Mukherjee, S., Dittmer, S., Shumaylov, Z., Lunz, S., Öktem, O., & Schönlieb, C. B. (2020). "Learned convex regularizers for inverse problems."
[5] Berk, Aaron, et al. "Deep Proximal Gradient Method for Learned Convex Regularizers." ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023
[6] Mukherjee, S., Dittmer, S., Shumaylov, Z., Lunz, S., Öktem, O., & Schönlieb, C. B. (2020). "Learned convex regularizers for inverse problems."
[7] Shumaylov, Zakhar, et al. "Provably Convergent Data-Driven Convex-Nonconvex Regularization." arXiv:2310.05812 (2023).
Prerequisite Skills All of the projects proposed can be pursued from both programming and theory direction, however a level of programming (preferably Python) experience is required. Other than this, interest is the only thing needed!

 

Efficient Probabilistic Machine Learning using Newton's Identities

Project Title Efficient Probabilistic Machine Learning using Newton's Identities
Project Listed 26 January 2024
Contact Name Henry Moss
Contact Email hm493@cam.ac.uk
Company/Lab/Department DAMTP / Institute of Computing for Climate Science
Project Duration Up to 8 weeks
Project Description

Gaussian processes are powerful probabilistic machine learning models, with particular prominence among many scientific disciplines, like in cosmology, molecular search, and climate forecasting. One reason for their prominence is the possibility they provide of directly encoding prior scientific knowledge into otherwise purely data-driven machine learning tasks. In particular, The choice of the Gaussian processes’ kernel function controls the covariance of the learned processes, with specific choices of kernel corresponding to different physical prior knowledge, e.g. conservation laws, symmetries, smoothness, or the satisfaction of particular differential equations.

However, when constructing predictive models for systems or quantities with high-dimensional inputs, particularly in scenarios with limited data, the necessity arises to incorporate additional structural assumptions. This project aims to explore methods for efficiently encoding such structures by leveraging well-established mathematical expressions, reminiscent of the renowned Girard-Newton identity.

The selected student will become an expert in Bayesian ML ( e.g. Gaussian processes), while concurrently scouring the mathematical literature to identify relevant extensions to the Girard-Newton identity. A key part of the project will be in building open-source tooling to enable the use of the project’s outcomes by the wider climate community.

 

Majority Dynamics in Graphs

Project Title Majority Dynamics in Graphs
Project Listed 29 January 2024
Contact Name Marcelo Campos
Contact Email mc2482@cam.ac.uk
Company/Lab/Department DPMMS / Combinatorics
Project Duration Up to 8 weeks
Project Description Given any graph $G$ with colored vertices one can define the majority dynamics process as the following. At each round simultaneously every vertex will turn the color that appears the most in it's neighborhood, breaking ties at random. This process has been studied for random graphs and for initial coloring being random, but many open questions remain. We would be interested in knowing for a given instance of the process, how long does it run until it reaches an equilibrium? Does one color become dominant?
References Majority Dynamics and Aggregation of Information in Social Networks - Elchanan Mossel, Joe Neeman, Omer Tamuz
Convergence, unanimity and disagreement in majority dynamics on unimodular graphs and random graphs - Itai Benjamini, Siu-On Chan, Ryan O'Donnell, Omer Tamuz, Li-Yang Tan

 

Upper bounds on third eigenvalues of adjacency matrices

Project Title Upper bounds on third eigenvalues of adjacency matrices
Project Listed 16 February 2024
Contact Name Jan Petr (PhD student in DPMMS with Prof Bela Bollobas)
Contact Email jp895@cam.ac.uk
Company/Lab/Department DPMMS
Project Duration Up to 8 weeks
Project Description In 1989 a flawed proof of the following was published: the $i$-th eigenvalue of the adjacency matrix of a graph of $n$ vertices is at most $\lfloor \frac{n}{i} \rfloor$. While this statement is true for $i \in \{1,2\}$, it does not hold in general whenever $i>3$. The main task of the summer project would be to pursue a tight upper bound on the third eigenvalue in terms of $n$.
References Liu, Ning: Unsolved Problems in Spectral Graph Theory, arXiv 2023
Nikiforov: Extrema of graph eigenvalues, Linear Algebra and its Applications, 2015
Linz: Improved Lower Bounds on the Extrema of Eigenvalues of Graphs, Graphs and Combinatorics, 2023