2022 CMP Academic Projects

This is a list of the Academic CMP project proposals from summer 2022.

Dynamical modelling of mutation as an immune defence
Keywords: Disease dynamics, virus evolution, computational modelling
SARS-CoV-2 pandemic-scale phylogenetics
Keywords: Phylogenetics, mathematical modeling, molecular evolution, epidemiological modeling
Discovering regions of functional importance in RNA viruses
Keywords: Computational modelling, virus evolution, bioinformatics, mathematical biology
Speech enhancement for hearing devices: learned sound representations versus deterministic transforms
Keywords: deep learning, speech enhancement, frequency domain, auditory filterbanks, speech intelligibility
Decoding the Neural Signature of Speech Perception
Keywords: Cognitive neuroscience, brain-computer interface, speech perception, audio processing
Formalising Thom encoding in Isabelle/HOL
Keywords: Mechanising mathematics, proof assistant, real algebraic geometry, computer algebra
Representing plant hydraulics and plant water stress in a dynamic global vegetation model
Keywords: Water Stress Plants ABA Stomata
Climate Repair: Ice Thickening
Keywords: Arctic, ice, geoengineering, partial differential equations, numerical modelling
Integrating microvascular biophysics with graph neural networks
Keywords: Machine learning; biophysics; blood vessels; graph neural networks; modelling
Formalisation of material in number theory/additive combinatorics using Isabelle/HOL
Keywords: number theory, additive combinatorics, proof assistants, interactive theorem proving, Isabelle/HOL
Deep learning for microscopy image reconstruction
Keywords: deconvolution, deep learning, lightsheet microscopy, image reconstruction
Bayesian machine learning and theory in cosmology and particle physics
Keywords: Bayesian Inference; Machine Learning; Cosmology
Novel Flexible Polyhedra
Keywords: geometry, computer programming
Quantum stability of a novel theory of gravity
Keywords: modified gravity, torsion, computer algebra, effective field theory, cosmology
R&D portfolio optimisation
Keywords: R&D portfolio, biopharma, monte carlo simulation, predictive analytics
Thawing frozen mutations in an ancient transmissible cancer
Keywords: Genomics, Cancer, Evolution, Mutational Signatures
Algebraic geometry and problem-based learning
Keywords: algebraic geometry, schemes, mathematical exposition, scientific writing, mathematics education
Formalising Modular Forms and Dirichlet Series in Isabelle/HOL
Keywords: number theory, modular forms, interactive theorem proving, type theory, Isabelle/HOL
Bayesian Learning for Automation
Keywords: Bayesian learning; Neural networks; MCMC; Reinforcement Learning
AI for Coronary Artery Imaging
Keywords: Autoencoders, deep learning, python, imaging, computer vision

Dynamical modelling of mutation as an immune defence

Project Title	Dynamical modelling of mutation as an immune defence
Keywords	Disease dynamics, virus evolution, computational modelling
Contact Name	Jordan Skittrall
Contact Email	jps55@cam.ac.uk
Company/Lab/Department	Department of Pathology
Address	Division of Virology, Addenbrooke's Hospital, Cambridge, CB2 0QQ
Period of the Project	8 weeks
Work Environment	The intention is that you will work in the Division of Virology, Department of Pathology, which is on the Addenbrooke's site in Cambridge. Should it be necessary to work remotely this will be possible, but the aim would be to work in person to maximise opportunities for discussion of ideas. Hours are flexible but you should expect to work standard working weeks in total. Jordan Skittrall (Pathology) will be the day-to-day supervisor and contact. There will be the opportunity to get to know other members of the division and to join any meetings taking place during the placement period.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	One form of immune response to viral infection involves deliberately mutating the virus to make it stop working. This is seen, for example, in HIV infection, where around an order of magnitude of the genetic diversity seen in viruses can be explained by this immune response, and where the virus makes a protein to defend against the response. It is also the mechanism by which some antiviral drugs work - including one of the drugs recently licensed for use in treating COVID-19. But it is a potentially risky strategy - mutations also allow viruses to escape immune responses and develop drug resistance.
Brief Description of the Project	The aim of this project is to develop a model to help us understand better how key parameters affect the balance between mutation helping the host, and mutation helping the virus. In particular, the aim is to understand to what extent both outcomes may occur in the same population if there is variability in the parameters within the population, and to estimate how often such outcomes occur. In the project, you will develop a simplified model capturing the key aspects of the mutation process, and implement and refine that model in one (or possibly more than one) infection scenario. To understand the parameter space developed, you will need to implement basic code to describe possible outcomes. This will be sufficiently straightforward that you can learn as you go if you have not done this before, but you should have basic familiarity with at least one programming language. Funding for this project is not guaranteed, and if interested in this project you are encouraged to make contact as soon as possible, as many funding opportunities require a named student in the application and earlier contact will allow us to apply before deadlines for more funders. You will not be required to respond to any project offer before the common deadline for the placements.
References	A background to the way the mutational arms race proceeds in HIV as an example (although we may work on a problem with a simpler setup) is given in the following references: (1) Armitage et al. "APOBEC3G-Induced Hypermutation of Human Immunodeficiency Virus Type-1 Is Typically a Discrete 'All or Nothing' Phenomenon" (https://doi.org/10.1371/journal.pgen.1002550) (2) Sadler et al. "APOBEC3G Contributes to HIV-1 Variation through Sublethal Mutagenesis" (https://doi.org/10.1128/JVI.00056-10) [You do not need to understand the detail of the experiments performed in this paper, but should understand that it gives evidence that the real-world situation may be less clear-cut than suggested in reference (1).] An introduction to a completely different way this problem can be manifested is found in the following reference, which explains how molnupiravir, now licensed as a COVID-19 treatment, works: (3) Kabinger et al. "Mechanism of molnupiravir-induced SARS-CoV-2 mutagenesis" (https://doi.org/10.1038/s41594-021-00651-0)
Prerequisite Skills	Statistics;Dynamical Systems
Other Skills Used in the Project	Probability/Markov Chains;Data Visualization;Mathematical Biology
Programming Languages	Python;MATLAB;Mathematica

SARS-CoV-2 pandemic-scale phylogenetics

Project Title	SARS-CoV-2 pandemic-scale phylogenetics
Keywords	Phylogenetics, mathematical modeling, molecular evolution, epidemiological modeling
Contact Name	Nicola De Maio
Contact Email	demaio@ebi.ac.uk
Company/Lab/Department	EMBL-EBI
Address	The European Bioinformatics Institute (EMBL-EBI) Wellcome Genome Campus Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
Period of the Project	8 weeks, or preferably more
Work Environment	The student will be supervised day by day by a senior postdoc in the group, interact regularly with 2 other PhD students in the group, attend weekly group meetings, and be supervised weekly by the group leader.
Project Open to	Undergraduates;Master's (Part III) students
Background Information	The COVID-19 pandemic has been accompanied by an extremely intense and widespread sequencing effort, resulting in millions of SARS-CoV-2 genomes available to researchers to help investigate the spread and evolution of the virus. However, existing data analysis and mathematical modeling methods struggle to deal with this volume of data.
Brief Description of the Project	The student will contribute to methods developed in our lab to efficiently and accurately investigate SARS-CoV-2 genome data and, more generally, will work on phylogenetic methods that infer evolutionary history from DNA sequence data. The project will involve the development of efficient algorithms and mathematical models and their implementation in Python, C++ or Java. The project can involve the development of Markov models for phylogenetics or the use of neural networks for molecular evolution. The ideal outcome would be the development of novel mathematical models to accurately describe complex scenarios of DNA evolution in a computationally efficient manner.
References	https://www.nature.com/articles/s41588-021-00862-7 https://doi.org/10.1093/gbe/evab087 https://doi.org/10.1101/2021.03.15.435416
Prerequisite Skills	Probability/Markov Chains;Programming, preferably in Python, but Java and C++ would also be extremely useful.
Other Skills Used in the Project	Statistics;Simulation
Programming Languages	Python;C++;Java
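
As an illustration of the kind of Markov model of sequence evolution mentioned in the project description, the sketch below uses the Jukes-Cantor (JC69) model. JC69 is chosen here only because it is the simplest textbook substitution model; it is an assumption made for illustration, not necessarily a model used in the group's methods, and the function names are hypothetical.

```python
import math


def jc69_transition_prob(same: bool, d: float) -> float:
    """JC69 probability of ending in the same nucleotide (same=True) or in
    one specific different nucleotide (same=False) after a branch of
    length d, measured in expected substitutions per site."""
    decay = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay


def jc69_distance(seq_a: str, seq_b: str) -> float:
    """JC69-corrected evolutionary distance between two aligned sequences,
    computed from the proportion p of sites at which they differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Two toy aligned sequences differing at one site out of eight.
distance = jc69_distance("ACGTACGT", "ACGTACGA")
```

The corrected distance is larger than the raw proportion of differing sites because the model accounts for repeated substitutions at the same site.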

Discovering regions of functional importance in RNA viruses (13 February Deadline)

Project Title	Discovering regions of functional importance in RNA viruses
Keywords	Computational modelling, virus evolution, bioinformatics, mathematical biology
Contact Name	Jordan Skittrall
Contact Email	jps55@cam.ac.uk
Company/Lab/Department	Department of Pathology
Address	Division of Virology, Addenbrooke's Hospital, Cambridge, CB2 0QQ
Period of the Project	8 weeks
Work Environment	The intention is that you will work in the Division of Virology, Department of Pathology, which is on the Addenbrooke's site in Cambridge. Should it be necessary to work remotely this will be possible, but the aim would be to work in person to maximise opportunities for discussion of ideas. Hours are flexible but you should expect to work standard working weeks in total.
Project Open to	Undergraduate students only
Background Information	The genetic sequences of all organisms we see can be viewed as the results of a natural experiment in what is capable of surviving. By examining the sequences of these organisms we can analyse the results of that natural experiment. Viruses that replicate using RNA (a large proportion of the viruses we know) give some of the smallest examples of such sequences, making their analysis computationally tractable. By detecting regions of such viruses that have to be conserved for the virus to survive, we get pointers to elements of the viral lifecycle, and to possible drug targets.
Brief Description of the Project	This project would be especially suitable for somebody who wants to explore moving into bioinformatics, or applying mathematical knowledge to microbiology. It will be advertised to both mathematicians and biologists, and the focus for a mathematician will be on learning the skills in handling biological data and in virology required to undertake an analysis and start interpreting the results. In this project you will apply to a set of virus genomes one of our mathematical techniques for searching for regions of interest. You will need to: develop sufficient understanding of the mathematics of the analysis pipeline to understand the implications of the mathematics for interpretation of results; download and curate a set of virus sequences ready for analysis; adapt code (written in Mathematica) in order to apply it to your dataset; visualise the output (develop graphical methods of representing the results of your analysis); be able to discuss your findings in a cross-disciplinary fashion with colleagues in mathematics and virology to come to a shared interpretation of the results (in terms of regions of interest you have identified in the viral genome). There are a few options for viruses it would be possible to work with on this project, which can be discussed prior to application. The project is likely to focus on a virus capable of causing human disease for which drug treatment is sought. The background work underpinning this project stretches all the way from open problems in probability theory, through the mathematics of signal processing and bioinformatics, to wet lab molecular biology and clinical applications. It is a requirement of the funding stream we propose to seek for this project that the project be strongly microbiological in nature and the project background be in microbiology in such a way that students undertaking it will be exposed to core microbiology thinking.
Funding for this project will be sought via application to an external body (and so cannot be guaranteed). The deadline for the external body's funding applications necessitates an earlier deadline of *13th February 2022* for expressions of interest, although earlier expressions of interest will make funding application more straightforward.
References	These references are listed in priority order for background reading (the first two are key), but if you have time, the progression of ideas is slightly easier to follow if they are read chronologically. There is no need to have read the references prior to sending an enquiry about the project: they are included in case you would like more information. The references describe the mathematical underpinning of the techniques used, and demonstrate some previous applications of those techniques. [1] "A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data." Julia R. Gog, Andrew M.L. Lever, Jordan P. Skittrall. PLoS ONE, 2018 13(4):e0195763. (https://doi.org/10.1371/journal.pone.0195763) [2] "A scale-free analysis of the HIV-1 genome demonstrates multiple conserved regions of structural and functional importance." Jordan P. Skittrall, Carin K. Ingemarsdotter, Julia R. Gog, Andrew M.L. Lever. PLoS Comput Biol, 2019 15(9):e1007345. (https://doi.org/10.1371/journal.pcbi.1007345) [3] "Codon conservation in the influenza A virus genome defines RNA packaging signals." Julia R. Gog, Emmanuel Dos Santos Afonso, Rosa M. Dalton, India Leclerq, Laurence Tiley, Debra Elton, Johann C. von Kirchbach, Nadia Naffakh, Nicolas Escriou, Paul Digard. Nucleic Acids Res, 2007 (35) 1897-1907. (https://doi.org/10.1093/nar/gkm087) [4] "Genomic analysis of codon, sequence and structural conservation with selective biochemical-structure mapping reveals highly conserved and dynamic structures in rotavirus RNAs with potential cis-acting functions." Wilson Li, Emily Manktelow, Johann C. von Kirchbach, Julia R. Gog, Ulrich Desselberger, Andrew M. Lever. Nucleic Acids Res, 2010 (38) 7718-7735. (https://doi.org/10.1093/nar/gkq663)
Prerequisite Skills
Other Skills Used in the Project	Statistics;Database Queries;Data Visualization
Programming Languages	Mathematica, small amounts of bash (it's fine to start the project with no knowledge of these)

Speech enhancement for hearing devices: learned sound representations versus deterministic transforms

Project Title	Speech enhancement for hearing devices: learned sound representations versus deterministic transforms
Keywords	deep learning, speech enhancement, frequency domain, auditory filterbanks, speech intelligibility
Contact Name	Clément Gaultier, Tobias Goehring
Contact Email	Clement.Gaultier@mrc-cbu.cam.ac.uk
Company/Lab/Department	Deep Hearing Lab, MRC Cognition and Brain Sciences Unit, University of Cambridge
Address	Clement.Gaultier@mrc-cbu.cam.ac.uk
Period of the Project	Any 8-week period from late June to early September
Work Environment	The student will be part of the Deep Hearing Lab (1) based at the MRC Cognition and Brain Sciences Unit (2) and work with primary supervisor Dr. Clément Gaultier as well as secondary supervisor Dr. Tobias Goehring. The student will benefit from joining the Cambridge Hearing Group (3), a world-leading and vibrant research network in Cambridge. The candidate will have the opportunity to use applied mathematics for signal processing and learn basics of time-frequency sound analysis/synthesis. Remote working can be arranged if conditions make it difficult or not suitable to come to the office daily (depending on the evolving COVID situation). (1): Deep Hearing Lab: https://www.deephearinglab.com (2): MRC CBSU: https://www.mrc-cbu.cam.ac.uk (3): Cambridge Hearing Group: https://www.hearing-research.group.cam.ac.uk
Project Open to	Undergraduates;Master's (Part III) students
Background Information	This project is part of a larger multidisciplinary project studying new speech enhancement strategies with the aim to improve sound perception for people using hearing devices (hearing aids, cochlear implants) in challenging listening situations with noise and reverberation.
Brief Description of the Project	Deep learning has brought substantial improvements to speech recognition, enhancement and separation systems by estimating a time-frequency mask (with a masking network) in a deterministic transformed domain (e.g. the Fourier domain) for time-domain sound signals. The masked representation is then transformed back to a time-domain signal (e.g. by an inverse Fourier transform) to yield the enhanced sound. In recent years, new artificial neural network architectures (Recurrent Neural Networks, Attention mechanisms) along with the introduction of end-to-end learning systems have provided a significant boost in performance [1]. These new techniques work in an Encoder-Mask-Decoder fashion where the mask is no longer estimated from a fixed transformed domain but from learned representations available at the encoder stage. At the decoder stage, the masked representations are then transformed back to a time-domain signal ("decoded") to obtain the enhanced sound signals. This project will explore the impact of such learned representations compared with deterministic transforms (auditory-inspired transforms, discrete Fourier transforms, etc.) in the context of noise and reverberation compensation for hearing-impaired people who use hearing aids or cochlear implants. The student will investigate research questions including, but not limited to, the following: 1. How does learning the encoder, decoder or both affect speech enhancement performance compared to applying deterministic auditory-inspired transforms? 2. Are there any properties of learned or fixed transforms that allow complexity or dimensionality reduction with similar performance? The outcomes of the study will be systematically evaluated using objective measures and pilot listening tests, and results will contribute to an ongoing research project.
The student will use common high-level (Python) deep learning frameworks for speech separation/enhancement including filterbank design tools for comparing the encoder-decoder models and deterministic transforms [2].
References	[1]: Luo, Yi, and Nima Mesgarani. 'TasNet: time-domain audio separation network for real-time, single-channel speech separation.' 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018. [2]: Pariente, Manuel, et al. 'Asteroid: the pytorch-based audio source separation toolkit for researchers.' arXiv preprint arXiv:2005.04132 (2020).
Prerequisite Skills	Statistics;Algebra/Number Theory;The candidate should have basic knowledge of a programming language and be keen to learn or use Python and high-level signal processing tools. Basic knowledge of statistics and linear algebra. An interest in acoustics or speech processing is a plus.
Other Skills Used in the Project
Programming Languages	Python;MATLAB;R;C++;No Preference
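
The transform-mask-inverse pipeline with a deterministic transform, as outlined in the project description, can be sketched in a few lines. This is a minimal illustration under strong simplifying assumptions: non-overlapping rectangular frames, a plain discrete Fourier transform written with the standard library, and a crude magnitude threshold standing in for the masking network. The names enhance and threshold_mask are hypothetical, and nothing here reflects the lab's actual code or the Asteroid toolkit API.

```python
import cmath
import math


def dft(frame):
    """Discrete Fourier transform of one frame (deterministic 'encoder')."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT back to a real time-domain frame ('decoder')."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def threshold_mask(spectrum, ratio=0.5):
    """Stand-in for a masking network: keep bins whose magnitude is at
    least `ratio` times the largest magnitude in the frame."""
    peak = max(abs(s) for s in spectrum)
    return [1.0 if abs(s) >= ratio * peak else 0.0 for s in spectrum]


def enhance(signal, frame_len, mask_fn):
    """Transform each frame, apply the mask, and transform back.
    Samples that do not fill a whole final frame are dropped."""
    output = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        spectrum = dft(signal[start:start + frame_len])
        mask = mask_fn(spectrum)
        output.extend(idft([m * s for m, s in zip(mask, spectrum)]))
    return output


# Toy input: a strong tone plus a weak higher-frequency component.
tone = [math.sin(2 * math.pi * t / 8) for t in range(16)]
noisy = [x + 0.1 * math.sin(2 * math.pi * 3 * t / 8) for t, x in enumerate(tone)]
enhanced = enhance(noisy, frame_len=8, mask_fn=threshold_mask)
```

In an Encoder-Mask-Decoder system the dft and idft functions would be replaced by learned transforms, which is the comparison the project sets out to study.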

Decoding the Neural Signature of Speech Perception

Project Title	Decoding the Neural Signature of Speech Perception
Keywords	Cognitive neuroscience, brain-computer interface, speech perception, audio processing
Contact Name	Tobias Goehring and/or Alexis Deighton MacIntyre
Contact Email	alexisdeighton.macintyre@mrc-cbu.cam.ac.uk
Company/Lab/Department	Deep Hearing Lab, MRC Cognition and Brain Sciences Unit, University of Cambridge
Address	alexisdeighton.macintyre@mrc-cbu.cam.ac.uk
Period of the Project	8 weeks
Work Environment	The student will work together with one primary supervisor (Alexis Deighton MacIntyre/Post Doc) but will also benefit from close cooperation with other members of a friendly and vibrant lab, as well as connections within the broader Cambridge Hearing Group community, a multidisciplinary affiliation spanning brain sciences, medicine, and engineering, and the MRC-CBU, a world-leading research centre in basic and translational cognitive neuroscience. For more information, see https://www.hearing-research.group.cam.ac.uk/ and https://www.mrc-cbu.cam.ac.uk/
Project Open to	Undergraduates; Master's (Part III) students
Background Information	Electroencephalography (EEG) is a method to record brain activity in the form of electrical signals at the scalp. Recent analytical advances allow us to infer or reconstruct aspects of subjective, auditory experiences, such as speech perception, using a listener's neural data alone. This development may hold promise for applications in brain-computer interfaces, such as smart hearing aids or cochlear implants that aim to optimise auditory perception for people with hearing loss.
Brief Description of the Project	Various techniques exist to correlate external stimuli, like acoustic recordings of speech, with EEG data. Some popular approaches include mutual information (MI) analysis and the fitting of temporal response functions (TRF) using regularised linear regression. One problem is that the resulting information and/or correlation values, though statistically robust, tend to be very small in magnitude, suggesting room for improvement. It may be that the specific choice of ground truth (e.g., engineered acoustic features) is inappropriate, or that the strength of the correspondence between stimulus and neural response varies over time, in which case a temporally dynamic approach may be preferable. Finally, data-driven models derived with machine learning (ML) may complement and/or surpass the techniques described above, which assume a simple stimulus-to-brain mapping. Using EEG data from human listeners, the student's role will entail the systematic comparison of different input acoustic features, whilst taking the form of analysis (e.g., MI, TRF) as well as temporal factors (e.g., discrete versus continuous sampling, effects of window length) into account. The objective is to determine the impact of choice of feature on the overall measure of correspondence between acoustic stimuli and brain signals. The results of this project will contribute to a bigger, ongoing project and inform optimal decoding approaches with applications in hearing research. Depending on the student's interest, there is also some opportunity to explore ML-based, non-linear methods and compare them to more established, but constrained, linear approaches.
References	Crosse, M. J., Di Liberto, G. M., Bednar, A., & Lalor, E. C. (2016). The multivariate temporal response function (mTRF) toolbox: a MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in human neuroscience, 10, 604. Ding, N., & Simon, J. Z. (2014). Cortical entrainment to continuous speech: functional roles and interpretations. Frontiers in human neuroscience, 8, 311.
Prerequisite Skills	Statistics;Mathematical Analysis;Predictive Modelling
Other Skills Used in the Project	Machine learning, biomedical data analysis
Programming Languages	Python;MATLAB
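
To make the idea of fitting a temporal response function with regularised linear regression concrete, here is a minimal sketch on synthetic data. It assumes a single stimulus feature, a single response channel, a forward (stimulus-to-response) model and ridge regularisation solved directly in plain Python. The function names are hypothetical and this is not the mTRF toolbox implementation.

```python
import random


def lagged_design(stimulus, n_lags):
    """Row t holds stimulus[t], stimulus[t-1], ..., zero-padded at the start."""
    return [[stimulus[t - k] if t - k >= 0 else 0.0 for k in range(n_lags)]
            for t in range(len(stimulus))]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_trf(stimulus, response, n_lags, ridge):
    """Ridge estimate w = (X^T X + ridge * I)^{-1} X^T y over time lags."""
    x = lagged_design(stimulus, n_lags)
    xtx = [[sum(row[i] * row[j] for row in x) + (ridge if i == j else 0.0)
            for j in range(n_lags)] for i in range(n_lags)]
    xty = [sum(row[i] * y for row, y in zip(x, response)) for i in range(n_lags)]
    return solve(xtx, xty)


# Synthetic check: a noiseless response generated from a known TRF.
random.seed(0)
true_trf = [0.0, 0.5, 1.0, -0.3]
stimulus = [random.gauss(0.0, 1.0) for _ in range(300)]
response = [sum(w * s for w, s in zip(true_trf, row))
            for row in lagged_design(stimulus, len(true_trf))]
estimated_trf = fit_trf(stimulus, response, n_lags=4, ridge=1e-6)
```

With real EEG the response is noisy and the ridge parameter is usually chosen by cross-validation; the small fixed value here only suits the noiseless toy data.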

Formalising Thom encoding in Isabelle/HOL

Project Title	Formalising Thom encoding in Isabelle/HOL
Keywords	Mechanising mathematics, proof assistant, real algebraic geometry, computer algebra
Contact Name	Wenda Li
Contact Email	wl302@cam.ac.uk
Company/Lab/Department	Department of Computer Science and Technology, University of Cambridge
Address	William Gates Building JJ Thomson Avenue Cambridge. CB3 0FD
Period of the Project	8 weeks
Work Environment	The student will mainly collaborate with me, but they are also welcome to chat with others in the ALEXANDRIA group (https://www.cl.cam.ac.uk/~lp15/Grants/Alexandria/). There is a possibility that the project will be remote, depending on the state of the pandemic.
Project Open to	Undergraduates;Master's (Part III) students
Background Information	Modern proof assistants allow users to interact with computers to mechanise mathematical theorems and their proofs. Here, all derivation steps will be mechanically checked, so that ambiguities and errors in normal hand-written proofs can be eliminated. Recent progress in proof assistants includes Grothendieck's Schemes in the Isabelle proof assistant [1] and some recent results from Peter Scholze being mechanised in the Lean proof assistant [2]. This project will involve doing mechanised proofs within the Isabelle proof assistant.
Brief Description of the Project	Real algebraic numbers are usually encoded as a univariate integer polynomial P and an interval (with rational end points) such that there is exactly one root of P within this interval. However, this encoding is not sufficient in the field of rationals extended with real algebraic numbers and infinitesimals, since a polynomial with infinitesimal coefficients can have roots that cannot be isolated with a rational interval. To address this problem, we may need Thom encoding to distinguish polynomial roots in a non-archimedean field, as has been implemented in the Z3 SMT solver [4]. In the project, the goal is to formalise the fundamental property of Thom encoding in Isabelle/HOL (Proposition 2.28 in Bochnak, Coste and Roy [3]).
References	[1] Bordg, Anthony, Lawrence Paulson, and Wenda Li. "Simple Type Theory is not too Simple: Grothendieck's Schemes without Dependent Types." arXiv preprint arXiv:2104.09366 (2021). [2] Castelvecchi, Davide. "Mathematicians welcome computer-assisted proof in 'grand unification' theory." Nature (2021). [3] Bochnak, J., Coste, M. and Roy, M. F. (2013). Real algebraic geometry (Vol. 36). Springer. [4] Grant Passmore and Leonardo de Moura (2013). Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals. Proceedings of the 24th International Conference on Automated Deduction (CADE-24).
Prerequisite Skills	Mathematical Analysis;Prior knowledge of proof assistants (e.g., Coq, Lean, Isabelle) is preferred but not required
Other Skills Used in the Project
Programming Languages	Isabelle
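
The two encodings discussed in the project description can be illustrated over the rationals, where both work. The sketch below represents a root by a polynomial with an isolating interval, then reads off a Thom-style encoding as the signs of the successive derivatives at that root. It is a heuristic illustration only: it assumes a simple root with a sign change across the interval, assumes no derivative vanishes at the root, and infers each sign from the endpoints of a refined interval rather than by a certified method. All function names are hypothetical, and this is unrelated to the Isabelle/HOL or Z3 implementations.

```python
from fractions import Fraction


def evaluate(poly, x):
    """Evaluate a polynomial (coefficients lowest degree first) by Horner's rule."""
    result = Fraction(0)
    for coefficient in reversed(poly):
        result = result * x + coefficient
    return result


def derivative(poly):
    """Coefficients of the derivative; the empty list once nothing is left."""
    return [i * c for i, c in enumerate(poly)][1:]


def sign(value):
    return (value > 0) - (value < 0)


def refine(poly, lo, hi, steps):
    """Shrink an isolating interval by exact rational bisection."""
    lo, hi = Fraction(lo), Fraction(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        s = sign(evaluate(poly, mid))
        if s == 0:
            return mid, mid
        if s == sign(evaluate(poly, lo)):
            lo = mid
        else:
            hi = mid
    return lo, hi


def thom_encoding(poly, lo, hi, steps=30):
    """Signs of P', P'', ... at the root of P isolated in (lo, hi)."""
    lo, hi = refine(poly, lo, hi, steps)
    signs = []
    d = derivative(poly)
    while d:
        s_lo, s_hi = sign(evaluate(d, lo)), sign(evaluate(d, hi))
        if s_lo != s_hi:
            raise ValueError("interval too coarse to read off a derivative sign")
        signs.append(s_lo)
        d = derivative(d)
    return signs


# P(x) = x^2 - 2 has roots sqrt(2) in (1, 2) and -sqrt(2) in (-2, -1).
p = [-2, 0, 1]
positive_root = thom_encoding(p, 1, 2)
negative_root = thom_encoding(p, -2, -1)
```

Here the two roots of the same polynomial receive different sign vectors, which is the distinguishing property the project aims to formalise; the sign vectors, unlike the rational intervals, remain meaningful when the coefficients involve infinitesimals.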

Representing plant hydraulics and plant water stress in a dynamic global vegetation model

Project Title	Representing plant hydraulics and plant water stress in a dynamic global vegetation model
Keywords	Water Stress Plants ABA Stomata
Contact Name	Prof. Andrew D. Friend
Contact Email	adf10@cam.ac.uk
Company/Lab/Department	Department of Geography
Address	Department of Geography, University of Cambridge, Downing Place, Cambridge CB2 3EN
Period of the Project	8 weeks summer 2022
Work Environment	The project is set up to complement current work in our research group and give the student a realistic research experience, so there is some room to customize the project based on the student's interests and skills. The project will be performed using MATLAB (or similar programming language) for the data analysis and FORTRAN for the implementation of sub models in HYBRID. Model runs will be carried out on the Cambridge CSD3 cluster. The student will work closely with a PhD student.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	In a changing climate the global hydrological cycle will alter significantly (Abbott et al., 2019) and drought conditions will become more frequent and intermittent (Greve et al., 2019; Pokhrel et al., 2021; Samaniego et al., 2018). Under these conditions it is critical for plants to manage their water economy efficiently. Modeling plant water stress is a key component of dynamic global vegetation models (Bonan et al., 2014; Eller et al., 2020; Kennedy et al., 2019) as it is necessary to correctly predict plant productivity and mortality during periods of drought (Stocker et al., 2019; Trugman et al., 2018).
Brief Description of the Project	In this project we would like you to work on implementing alternative stomatal conductance models and water stress functions into the HYBRID vegetation model, as well as further representing plant signaling pathways that lead to stomatal closure, regulating water loss. One signaling pathway that could be included is abscisic acid (ABA) concentration, which is a well-known driver of stomatal closure. This would include creating a sub-model for ABA synthesis, transport, and sequestration, as well as the changing sensitivity of stomatal conductance to ABA concentration (could be based on the work of Dewar, 2002). Furthermore, resulting plant water fluxes during drought can be compared and analyzed in respect to microwave observations of vegetation water content.
References	Abbott, B.W., Bishop, K., Zarnetske, J.P., Minaudo, C., Chapin, F.S., Krause, S., Hannah, D.M., Conner, L., Ellison, D., Godsey, S.E., Plont, S., Marçais, J., Kolbe, T., Huebner, A., Frei, R.J., Hampton, T., Gu, S., Buhman, M., Sara Sayedi, S., Ursache, O., Chapin, M., Henderson, K.D., Pinay, G., 2019. Human domination of the global water cycle absent from depictions and perceptions. Nat. Geosci. 12, 533-540. https://doi.org/10.1038/s41561-019-0374-y Bonan, G.B., Williams, M., Fisher, R.A., Oleson, K.W., 2014. Modeling stomatal conductance in the earth system: linking leaf water-use efficiency and water transport along the soil-plant-atmosphere continuum. Geosci. Model Dev. 7, 2193-2222. https://doi.org/10.5194/gmd-7-2193-2014 Dewar, R.C., 2002. The Ball-Berry-Leuning and Tardieu-Davies stomatal models: synthesis and extension within a spatially aggregated picture of guard cell function. Plant. Cell Environ. 25, 1383-1398. https://doi.org/10.1046/j.1365-3040.2002.00909.x Eller, C.B., Rowland, L., Mencuccini, M., Rosas, T., Williams, K., Harper, A., Medlyn, B.E., Wagner, Y., Klein, T., Teodoro, G.S., Oliveira, R.S., Matos, I.S., Rosado, B.H.P., Fuchs, K., Wohlfahrt, G., Montagnani, L., Meir, P., Sitch, S., Cox, P.M., 2020. Stomatal optimization based on xylem hydraulics (SOX) improves land surface model simulation of vegetation responses to climate. New Phytol. 226, 1622-1637. https://doi.org/10.1111/nph.16419 Greve, P., Roderick, M.L., Ukkola, A.M., Wada, Y., 2019. The aridity Index under global warming. Environ. Res. Lett. 14, 124006. https://doi.org/10.1088/1748-9326/ab5046 Kennedy, D., Swenson, S., Oleson, K.W., Lawrence, D.M., Fisher, R., Lola da Costa, A.C., Gentine, P., 2019. Implementing Plant Hydraulics in the Community Land Model, Version 5. J. Adv. Model. Earth Syst. 11, 485-513.
https://doi.org/10.1029/2018MS001500 Pokhrel, Y., Felfelani, F., Satoh, Y., Boulange, J., Burek, P., Gädeke, A., Gerten, D., Gosling, S.N., Grillakis, M., Gudmundsson, L., Hanasaki, N., Kim, H., Koutroulis, A., Liu, J., Papadimitriou, L., Schewe, J., Müller Schmied, H., Stacke, T., Telteu, C.-E., Thiery, W., Veldkamp, T., Zhao, F., Wada, Y., 2021. Global terrestrial water storage and drought severity under climate change. Nat. Clim. Chang. https://doi.org/10.1038/s41558-020-00972-w Samaniego, L., Thober, S., Kumar, R., Wanders, N., Rakovec, O., Pan, M., Zink, M., Sheffield, J., Wood, E.F., Marx, A., 2018. Anthropogenic warming exacerbates European soil moisture droughts. Nat. Clim. Chang. 8, 421-426. https://doi.org/10.1038/s41558-018-0138-5 Stocker, B.D., Zscheischler, J., Keenan, T.F., Prentice, I.C., Seneviratne, S.I., Peñuelas, J., 2019. Drought impacts on terrestrial primary production underestimated by satellite monitoring. Nat. Geosci. 12, 264-270. https://doi.org/10.1038/s41561-019-0318-6 Trugman, A.T., Medvigy, D., Mankin, J.S., Anderegg, W.R.L., 2018. Soil Moisture Stress as a Major Driver of Carbon Cycle Uncertainty. Geophys. Res. Lett. 45, 6495-6503. https://doi.org/10.1029/2018GL078131
Prerequisite Skills	Predictive Modelling
Other Skills Used in the Project
Programming Languages	MATLAB; Fortran

Climate Repair: Ice Thickening

Project Title	Climate Repair: Ice Thickening
Keywords	Arctic, ice, geoengineering, partial differential equations, numerical modelling
Contact Name	Katie Parker (Project Manager), Professor Hugh Hunt (Supervisor)
Contact Email	kvp24@dow.cam.ac.uk
Company/Lab/Department	Centre for Climate Repair / Engineering
Address	Centre for Climate Repair, Downing College
Period of the Project	8-10 weeks at a date to be agreed between June and Sept
Work Environment	The project can be undertaken remotely or in person, subject to discussion at interview. The team typically work office hours (9-5) but are flexible.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	The Arctic is melting fast. The Centre for Climate Repair (CCRC) is looking at technologies that might slow down or reverse this melting. One idea is to spray seawater onto existing ice during the cold winter, thereby thickening it so that it will last through the Arctic summer. A fourth-year engineering student has developed ice-thickening experiments and has made interesting measurements with water flowing in a channel inside a freezer at -18°C, but a mathematical model is needed so that the measurements can be properly interpreted.
Brief Description of the Project	This project is ideally suited to an applied mathematician or engineer who is comfortable with partial differential equations and numerical methods. The balance between heat transfer, sensible heat and latent heat in flowing water is formulated in terms of distance and time. As if this isn't complicated enough, salt water as it freezes develops a salt-concentration gradient which needs to be included. If we have a good model, then we can use it to design methodologies for creating ice in the Arctic.
References	Peter Wadhams, 'A Farewell to Ice'; Centre for Climate Repair website, especially working papers - https://www.climaterepair.cam.ac.uk/working-papers; a couple of papers from last year's internships available on request from kvp24@dow.cam.ac.uk
Prerequisite Skills	Numerical Analysis; Mathematical Analysis; Predictive Modelling
Other Skills Used in the Project
Programming Languages	Python; MATLAB
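
As a rough illustration of the kind of heat balance described above, the sketch below integrates the classical Stefan growth law, in which latent heat released at the ice-water interface is conducted through the ice to a cold surface: rho * L * dh/dt = k * dT / h. This is only a starting point under strong assumptions, not the project's model: it ignores the flow, the sensible heat and the salt-concentration gradient, and the material constants and the 18 K temperature difference are assumed, illustrative values.

```python
import math

# Assumed, approximate properties of fresh ice (illustrative values only).
K_ICE = 2.2       # thermal conductivity, W/(m K)
RHO_ICE = 917.0   # density, kg/m^3
LATENT = 3.34e5   # latent heat of fusion, J/kg


def stefan_thickness(t, h0, delta_t):
    """Analytic ice thickness (m) after t seconds for rho*L*dh/dt = k*delta_t/h."""
    return math.sqrt(h0 ** 2 + 2.0 * K_ICE * delta_t * t / (RHO_ICE * LATENT))


def euler_thickness(t, h0, delta_t, steps=100_000):
    """Integrate the same ODE with explicit Euler steps as a numerical check."""
    h = h0
    dt = t / steps
    for _ in range(steps):
        h += dt * K_ICE * delta_t / (RHO_ICE * LATENT * h)
    return h


if __name__ == "__main__":
    day = 86_400.0
    # 1 cm of initial ice, surface assumed to be 18 K below the freezing point.
    print(stefan_thickness(day, 0.01, 18.0), euler_thickness(day, 0.01, 18.0))
```

Comparing the numerical and analytic curves is a cheap sanity check before adding the extra physics (flow along the channel, salinity) that the project asks for.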

Integrating microvascular biophysics with graph neural networks

Project Title	Integrating microvascular biophysics with graph neural networks
Keywords	Machine learning; biophysics; blood vessels; graph neural networks; modelling
Contact Name	Dr Paul Sweeney
Contact Email	Paul.sweeney@cruk.cam.ac.uk
Company/Lab/Department	Bohndiek Lab, Cancer Research UK Cambridge Institute, University of Cambridge
Address	Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE
Period of the Project	8 weeks (flexible)
Work Environment	Lab based - hours typically 10am - 4pm. Lab consists of several post-docs and PhD students. Can flexibly work from home at times.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	Many real-world datasets can be defined in the form of a graph, for example social networks, protein interactions and road networks. An active area of interest in machine learning is graph neural networks (GNNs), which operate on graph data. Application of GNNs has led to progress in fake news detection, antibacterial discovery and traffic detection. Biomedical imaging can extract information on the structure of large microvascular networks (10^6 vessels), which can be used as input to mathematical models to investigate biological transport phenomena. These networks of blood vessels can also be represented as graphs and so are amenable to GNNs. Due to the ever-increasing size of these datasets, as a result of advances in imaging, it would be interesting to see if GNNs can rapidly generate accurate predictions relating to the inherent structure of these networks, in addition to predicting more complex biophysics.
Brief Description of the Project	The aim of the project will be for the student to build their own graph neural network in Python (using standard APIs, e.g. TensorFlow / Keras) to form useful predictions relating to properties of vascular networks. Students will generate their own synthetic vascular graphs using existing software, to build a library of synthetic data, as well as utilise existing blood vessel structural datasets obtained via biomedical imaging. These data could be used to model blood flow using existing packages (C++, no experience needed) as well as act as input to their GNN. Initially the student will develop their GNN for undirected graphs to predict basic properties of the graph (e.g., path of least resistance). Next the GNN will be developed to incorporate directed graphs and trained against their blood flow simulations, enabling their GNN to predict biophysical properties for any arbitrary graph used as input. A successful project is deemed as one where the student gains competence and confidence in coding, model design and application. The project is open-ended, evolving as new ideas arise and dependent on student progress.
References	'A gentle introduction to graph neural networks' https://distill.pub/2021/gnn-intro/ 'A practical tutorial on graph neural networks' https://arxiv.org/pdf/2010.05234.pdf
Prerequisite Skills	Simulation; Predictive Modelling; A basic understanding of machine learning & coding experience
Other Skills Used in the Project	Statistics; Mathematical physics; PDEs
Programming Languages	Python; C++
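
To give a flavour of what a GNN layer does on an undirected vessel graph, here is a minimal sketch of a single graph-convolution step with symmetric normalisation (a GCN-style layer). It uses plain NumPy rather than TensorFlow / Keras so that it is self-contained; the toy bifurcation, the node features and the untrained weights are all hypothetical and are not taken from the project's data or software.

```python
import numpy as np


def normalised_adjacency(edges, n_nodes):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph given as edge pairs."""
    a = np.eye(n_nodes)  # self-loops let each node keep its own features
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: aggregate neighbours, mix channels, ReLU."""
    return np.maximum(a_hat @ features @ weights, 0.0)


# Hypothetical toy network: a parent vessel (node 0 to node 1) splitting in two.
edges = [(0, 1), (1, 2), (1, 3)]
a_hat = normalised_adjacency(edges, n_nodes=4)

# Hypothetical node features: [pressure-like value, local radius].
x = np.array([[1.0, 0.5], [0.8, 0.4], [0.6, 0.3], [0.6, 0.3]])
w = np.random.default_rng(0).normal(size=(2, 3))  # untrained weights

h = gcn_layer(a_hat, x, w)
print(h.shape)  # (4, 3): three hidden channels per node
```

Because the two daughter nodes have identical features and neighbourhoods, they receive identical hidden representations, which is the kind of structural symmetry a GNN respects by construction.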

Formalisation of material in number theory/additive combinatorics using Isabelle/HOL

Project Title	Formalisation of material in number theory/additive combinatorics using Isabelle/HOL
Keywords	number theory, additive combinatorics, proof assistants, interactive theorem proving, Isabelle/HOL
Contact Name	Dr. Angeliki Koutsoukou-Argyraki
Contact Email	ak2110@cam.ac.uk
Company/Lab/Department	University of Cambridge, Department of Computer Science and Technology (Computer Laboratory)
Address	15 JJ Thomson Avenue, CB3 0FD Cambridge
Period of the Project	8 weeks
Work Environment	ALEXANDRIA group, please see https://www.cl.cam.ac.uk/~lp15/Grants/Alexandria/. We will be working remotely.
Project Open to	Undergraduates; Master's (Part III) students
Background Information
Brief Description of the Project	The student will participate in a project involving the formalisation of material in number theory/additive combinatorics using the proof assistant (interactive theorem prover) Isabelle/HOL.
References	Recent related work: https://www.isa-afp.org/entries/Roth_Arithmetic_Progressions.html https://www.isa-afp.org/entries/Szemeredi_Regularity.html Index of the Archive of Formal Proofs: https://www.isa-afp.org/topics.html Isabelle: https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html
Prerequisite Skills	Mathematical Analysis; Algebra/Number Theory
Other Skills Used in the Project	Previous experience in Isabelle/HOL or other proof assistants is desirable but not necessary. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html
Programming Languages	Isabelle/HOL. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html

Deep learning for microscopy image reconstruction

Project Title	Deep learning for microscopy image reconstruction
Keywords	deconvolution, deep learning, lightsheet microscopy, image reconstruction
Contact Name	Leila Muresan
Contact Email	lam94@cam.ac.uk
Company/Lab/Department	Dept. of Physiology, Development and Neuroscience
Address	Anatomy building, CB2 3DY
Period of the Project	8 weeks
Work Environment	The student will be part of an on-going collaboration between MRC-LMB (Jerome Boulanger), DAMTP (Yury Korolev), the Quantitative Biology Institute, Yale University (Bogdan Toader) and PDN (Leila Muresan). There will be weekly meetings of the entire team (possibly online), otherwise the schedule and location arrangements are flexible.
Project Open to	Master's (Part III) students
Background Information	In recent years, machine learning and especially deep-learning techniques have had a huge impact on microscopy image analysis. For instance, the solutions of difficult segmentation tasks were hugely improved for both 2D and 3D data, and several image reconstruction and denoising methods have been designed that currently constitute the state of the art. Deep learning has also been used to model the data for single molecule localization microscopy, providing insightful forward models.
Brief Description of the Project	This project will focus on combining learned regularization and forward models for solving deconvolution problems in a mathematically coherent manner. The goal is to exploit the flexibility of learned models and the high performance of learned regularizers over traditional regularization methods while maintaining mathematical guarantees for the reconstructed images. (The goal of the project is open-ended; the candidate will contribute to ongoing work on lightsheet microscopy deconvolution.)
References
Prerequisite Skills
Other Skills Used in the Project	Numerical Analysis; Image processing; Simulation
Programming Languages	No Preference
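
For orientation, the sketch below shows a traditional regularization baseline of the kind that learned regularizers are compared against: Tikhonov-regularised deconvolution solved in the Fourier domain. It is a 1D toy under stated assumptions (circular convolution, a known and invented blur kernel, no noise) and is not the lightsheet forward model or the method used by the team.

```python
import numpy as np


def tikhonov_deconvolve(blurred, kernel, lam):
    """Solve min_x ||k * x - y||^2 + lam ||x||^2 for circular convolution."""
    k_hat = np.fft.fft(kernel)
    y_hat = np.fft.fft(blurred)
    # Closed-form minimiser, applied independently to each Fourier coefficient.
    x_hat = np.conj(k_hat) * y_hat / (np.abs(k_hat) ** 2 + lam)
    return np.real(np.fft.ifft(x_hat))


n = 64
signal = np.zeros(n)
signal[20:30] = 1.0  # hypothetical ground-truth object

kernel = np.zeros(n)
kernel[[0, 1, -1]] = [0.6, 0.2, 0.2]  # hypothetical symmetric blur kernel

# Forward model: circular convolution of the object with the kernel.
blurred = np.real(np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(signal)))
restored = tikhonov_deconvolve(blurred, kernel, lam=1e-10)
print(np.max(np.abs(restored - signal)))
```

With noisy data the parameter lam trades data fidelity against smoothness of the estimate; a learned regularizer replaces the simple penalty lam ||x||^2 with one fitted to example images.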

Bayesian machine learning and theory in cosmology and particle physics

Project Title	Bayesian machine learning and theory in cosmology and particle physics
Keywords	Bayesian Inference; Machine Learning; Cosmology;
Contact Name	Dr Will Handley
Contact Email	wh260@cam.ac.uk
Company/Lab/Department	Kavli Institute for Cosmology/Cavendish Laboratory
Address	Kavli Institute for Cosmology/Cavendish Laboratory
Period of the Project	8-12 weeks (depending on funding)
Work Environment	Working with my postdocs and PhD students in the KICC
Project Open to	Undergraduates; Master's (Part III) students
Background Information
Brief Description of the Project	In this project the student will work with Dr Handley and his team investigating the development and application of Bayesian machine learning techniques to modern and future cosmological and particle physics datasets. The precise details of the project will be tailored to the student's interests and skill set, but possible topics/projects include: 1. Developing machine learning algorithms for nested sampling and applying these to cosmological data sets - https://arxiv.org/abs/1506.00171 - https://arxiv.org/abs/2007.08496 2. Model-independent reconstruction of the primordial universe from cosmic microwave background data and cosmic dawn data - https://arxiv.org/abs/1908.00906 3. Developing and applying mathematical schemes for disentangling physical signatures in the primordial universe - https://arxiv.org/abs/1907.08524 - https://arxiv.org/abs/2009.05573 4. Combining particle physics and cosmological data as part of the GAMBIT team - https://gambit.hepforge.org/ - https://arxiv.org/abs/2009.03286 - https://arxiv.org/abs/2009.03287 5. Investigating quantum initial conditions for inflation - https://arxiv.org/abs/2112.07547 - https://arxiv.org/abs/1607.04148 Over the course of the project students can expect to learn some/all of: up-to-date cosmological research questions; science-grade Python; high performance computing; Bayesian inference; machine learning; computer algebra. Essential: three years of undergraduate physics, mathematics or equivalent; basic to intermediate Python experience and good programming skills; strong mathematical skills. Desirable: interest in/knowledge of general relativity/cosmology; experience using Mathematica/Maple/computer algebra.
References
Prerequisite Skills
Other Skills Used in the Project
Programming Languages	Python; Mathematica or Maple

Novel Flexible Polyhedra

Project Title	Novel Flexible Polyhedra
Keywords	geometry, computer programming
Contact Name	Simon Guest
Contact Email	sdg@eng.cam.ac.uk
Company/Lab/Department	Engineering Department
Address	sdg@eng.cam.ac.uk
Period of the Project	8 weeks
Work Environment	The student will be part of a small group in the Civil Engineering Building in West Cambridge.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	Cauchy showed that all convex polyhedra are rigid, but it wasn't until the 1970s that Bob Connelly found a non-convex polyhedron that was flexible. A related result is Alexandrov's uniqueness theorem, which shows that any polyhedron with a given metric (i.e. that can be folded from a net) has a unique convex realisation - a constructive proof of this was only found by Bobenko and Izmestiev in 2008. These results are closely connected to recent work in rigid origami, and a better understanding of the connection between them might allow us to develop novel foldable structures, for instance for use in spacecraft.
Brief Description of the Project	The project will start by developing a computer implementation of Bobenko and Izmestiev's algorithm for finding convex realisations for polyhedra, and then examine whether this algorithm can be modified to understand better the behaviour of (clearly non-convex) flexible polyhedra. This might give us insight to develop new flexible polyhedra, for instance novel forms that are not fully triangulated.
References	Much of the background is given in the recent book 'Frameworks, Tensegrities, and Symmetry' published by CUP, and available electronically from the University Library.
Prerequisite Skills	Geometry/Topology
Other Skills Used in the Project
Programming Languages	Python; MATLAB

Quantum stability of a novel theory of gravity

Project Title	Quantum stability of a novel theory of gravity
Keywords	modified gravity, torsion, computer algebra, effective field theory, cosmology
Contact Name	Will Barker
Contact Email	wb263@cam.ac.uk
Company/Lab/Department	Cavendish astrophysics & KICC
Address	Office K34, Kavli Institute for Cosmology, Cambridge, CB3 0HA
Period of the Project	8 weeks
Work Environment	The student may work remotely, but would ideally have a desk at the Kavli Institute for Cosmology, Cambridge (KICC). We have a small modified gravity nexus belonging to the Cavendish Astrophysics Group, comprising Professors Lasenby and Hobson and myself. The broader environment in the Kavli is dominated by theoretical, observational and statistical cosmologists. The student would be encouraged to take advantage of seminars and networking both at the Kavli and the CMS. There are also opportunities to liaise remotely with astroparticle theorists at CEICO in Prague and at the Instituut Lorentz in Leiden. We have free coffee.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	Einstein's General Relativity (GR) remains the preferred effective theory of gravity as spacetime curvature, explaining the orbital precession of Mercury and solar bending of starlight while underpinning modern cosmology. However, GR does not explain dark matter or dark energy, while an alleged 'Hubble tension' indicates that our Universe is expanding 10% faster than it should be. And of course, GR continues to stubbornly resist attempts at (complete) quantum reformulation. We recently attracted some attention by proposing an alternative theory of gravity as a blend of spacetime torsion and curvature: this appears to provide a cosmological constant and alleviate the Hubble tension. Our Lagrangian is wildly different from that of GR, with a quantum structure suggestive of renormalisability. We believe the theory adopts a torsion vacuum expectation value (VEV) at a primordial epoch, on the back of which the good classical phenomena emerge. However, many quantum/classical aspects of this torsion VEV, and the violent early-Universe physics of its formation, remain shrouded in mystery...
Brief Description of the Project	The findings of the project may debunk our theory or, if we are lucky, propel it further into the spotlight of community interest. The student may wish to target one of several new fronts we are opening in our research campaign: 1) *Quantum stability and the infrared* -- This is an urgent question; we don't really know if the torsion VEV is stable against quantum fluctuations, as is the case for the Minkowski vacuum of GR. The student will apply well established effective field theory and ghost condensate techniques to characterise the infrared environment of the VEV. Extensions to the ultraviolet are of course welcome depending on expertise, though expected to be more challenging. A stable vacuum is quite a big deal, while convincing instabilities would seem to rule our theory out: either way this avenue promises high returns. 2) *Cosmological perturbation theory* -- There is a very well established theory dictating how cosmic density perturbations evolve under gravity, which supports GR to amazing precision based on tiny anisotropies in the cosmic microwave background and the clustering of matter on the grandest scales. The student will extract and characterise the classical perturbation equations (and perhaps convenient gauges) around the torsion VEV, matching against GR where possible. This also targets aspects of the infrared environment, but uses classical methods so does not require prior knowledge of QFT. Apart from offering a neat standalone stability test the perturbation theory will facilitate, in the long run (2023), sophisticated Monte Carlo tests against cosmological survey data: to this end a successful student would also have a stake in these future research way points. 3) *Primordial symmetry breaking* -- We imagine that the Big Bang left our gravity theory in a torsionless conformal phase, bathed in the standard model plasma. 
So how and when does the torsion VEV form in relation to the condensation of the Higgs field? Could this process have driven inflation, the violent expansion thought to have occurred in the very early Universe? A decaying deviation from the torsion VEV just after inflation can alleviate the Hubble tension: what physics sets this initial condition? The student may wish to merely explore these questions using the background cosmology equations, and there is a viable research-grade project at this level. However depending on interest/experience in electroweak symmetry breaking or effective quantum theories of inflation, we may hope to propose a novel inflationary mechanism. These topics are not exhaustive and are subject to shift as we study the theory throughout spring 2022.
References	For the history of our new theory, skim chapters 2-4/refs therein: https://wevbarker.com/assets/papers/Barker_PhDThesis.pdf For an intro to spacetime torsion, skim the first three chapters: http://alpha.sinp.msu.ru/~panov/LibBooks/GRAV/Blagojevic_M.-Gravitation_and_gauge_symmetries(2002).pdf For ghost condensates and the infrared analysis: https://arxiv.org/abs/hep-th/0312099 For more infrared techniques that will apply near the VEV: https://arxiv.org/abs/hep-th/9210046 The excellent Les Houches notes on effective field theories: https://arxiv.org/abs/1804.05863 For effective field theories of inflation: https://physics.mcmaster.ca/~cburgess/Notes/InflationEFTs.pdf For an intro to cosmological perturbations: https://arxiv.org/abs/hep-th/0306071
Prerequisite Skills	Mathematical physics; PDEs; familiarity with GR and (very) introductory cosmology
Other Skills Used in the Project	Data Visualization; QFT and cosmological perturbation theory are a bonus according to chosen topic.
Programming Languages	Python; Mathematica (you can get a free license from Maths dept.!), maybe Maple if you prefer it.

R&D portfolio optimisation

Project Title	R&D portfolio optimisation
Keywords	R&D portfolio, biopharma, monte carlo simulation, predictive analytics
Contact Name	Nektarios Oraiopoulos
Contact Email	no245@cam.ac.uk
Company/Lab/Department	Judge Business School
Address	Trumpington Street, CB2 1AG
Period of the Project	8-10 weeks
Work Environment	Student will work mostly on their own and will have regular meetings with academic supervisor. Working remotely is fine. Would be good to have some meetings face to face.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	Managing the R&D pipeline in a biopharmaceutical company is one of the most significant challenges in the industry. Decisions have to be made regarding which projects should be advanced to the next stage (and therefore consume significant financial resources of the company) and which projects should be put on hold or terminated. Those decisions are made under significant uncertainty: each project has a likelihood of success, and specifically estimated rates of false positives (the current data look promising, but actually the project is doomed to fail) or false negatives (the current data look weak, but actually the project will work, if given resources).
Brief Description of the Project	The student will work closely with Dr Oraiopoulos (https://www.jbs.cam.ac.uk/faculty-research/faculty-a-z/nektarios-oraiopo...) to develop an R&D portfolio optimisation model. The optimisation model will have as inputs estimates regarding the cost of each project, the false positive and negative rates, the commercial potential, etc. and it will calculate (and visualize) the expected reward and risk of different portfolios (allowing the decision-maker to select the most promising one). E.g., the model might suggest that the decision-maker should select only 6 out of the 10 current projects. A key characteristic of the model should be that it compares portfolios of projects rather than single projects. The student might also be given access to large datasets that would allow her/him to estimate those false positive/negative rates using predictive models. The student will also receive feedback from experienced executives from the pharmaceutical industry that have created drugs that transformed the industry.
References	https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0... https://www.nature.com/articles/nrd3078?report=reader
Prerequisite Skills	Statistics; Probability/Markov Chains; Simulation; Predictive Modelling; Data Visualization
Other Skills Used in the Project
Programming Languages	Python; MATLAB; R
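
As a hedged sketch of how such a model might start, the code below updates each project's chance of success with Bayes' rule using its false positive and false negative rates, then uses Monte Carlo simulation to estimate the expected net reward and risk (standard deviation) of whole portfolios rather than single projects. Every number, and the assumption that projects succeed independently, is invented for illustration; the function names are hypothetical and do not come from the project.

```python
import itertools
import numpy as np


def posterior_success(prior, false_pos, false_neg):
    """P(project truly works | current data look promising), by Bayes' rule."""
    true_pos = prior * (1.0 - false_neg)
    return true_pos / (true_pos + (1.0 - prior) * false_pos)


def simulate_portfolio(p_success, payoff, cost, n_draws=20_000, seed=0):
    """Monte Carlo mean and standard deviation of net reward for one portfolio."""
    rng = np.random.default_rng(seed)
    # Each row is one simulated future; each column one (independent) project.
    wins = rng.random((n_draws, len(p_success))) < np.asarray(p_success)
    net = wins @ np.asarray(payoff, dtype=float) - float(np.sum(cost))
    return net.mean(), net.std()


# Hypothetical three-project pipeline (all numbers invented for illustration).
prior, fp, fn = [0.3, 0.5, 0.2], [0.2, 0.3, 0.1], [0.1, 0.2, 0.3]
payoff, cost = [100.0, 60.0, 150.0], [20.0, 15.0, 40.0]
p = [posterior_success(*args) for args in zip(prior, fp, fn)]

# Compare every non-empty portfolio by expected reward and risk.
for size in range(1, 4):
    for subset in itertools.combinations(range(3), size):
        idx = list(subset)
        mean, risk = simulate_portfolio(
            [p[i] for i in idx], [payoff[i] for i in idx], [cost[i] for i in idx]
        )
        print(subset, round(mean, 1), round(risk, 1))
```

Plotting the (risk, mean) pairs for all portfolios would give the decision-maker a simple picture of which combinations offer the best trade-off.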

Thawing frozen mutations in an ancient transmissible cancer

Project Title	Thawing frozen mutations in an ancient transmissible cancer
Keywords	Genomics, Cancer, Evolution, Mutational Signatures
Contact Name	Kevin Gori
Contact Email	kcg25@cam.ac.uk
Company/Lab/Department	Department of Veterinary Medicine
Address	Department of Veterinary Medicine, Madingley Road, Cambridge, CB3 0ES
Period of the Project	20 June - 19 August (some flexibility)
Work Environment	The student will work as part of the Transmissible Cancer group, based at the Department of Veterinary Medicine. Preferably they will spend at least three days a week in the lab.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	The project will focus on studying transmissible cancer, which is a rare class of cancer that has developed the ability to infect new hosts. The cancer in question is Canine Transmissible Venereal Tumour, which affects dogs worldwide. The project aims to use genomic sequences of CTVT to examine early events that took place in its many-centuries long evolution, that may help to explain how it arose in the first place.
Brief Description of the Project	This project will be based at the Department of Veterinary Medicine, and will involve genomic analysis of the canine transmissible venereal tumour (CTVT). CTVT is a transmissible cancer, which is a rare class of cancer that has developed the ability to infect new hosts, and behaves rather like a parasitic organism. CTVT is spread among dogs when they come in direct physical contact with tumour tissue infecting another individual, usually during mating. CTVT is by far the oldest clonally reproducing cancer known on Earth. The project will build on recent work done in our lab to unravel the earliest events that befell the tumour in its progression towards becoming infectious and globally endemic. Using high coverage DNA sequencing information from several CTVT samples, as well as from the uninfected tissue of their hosts, we have previously estimated the evolutionary tree that relates these tumours. This work has identified a previously unseen mutational signature (‘signature A’) that was active during the early evolution of CTVT, but was later switched off. In this project we will take estimates of genomic copy number in our samples, and use these to find genomic regions that have been duplicated in CTVT’s development. From these duplications we can identify ‘frozen time points’: fragments of DNA sequence that were gained during the early evolution of the cancer. The relative ages of these fragments will be estimated by the degree to which they have accumulated new mutations. Combined with this timing information, by examining the fragments for the presence of signature A we will be able to determine whether signature A occurred continuously, or in bursts. Additionally, the earliest fragments will be highly representative of the genotype of the animal in which CTVT arose, illuminating perhaps the characteristics of the earliest domesticated dogs.
References	Strakova, Andrea, and Elizabeth P. Murchison. 2015. “The Cancer Which Survived: Insights from the Genome of an 11000 Year-Old Cancer.” Current Opinion in Genetics & Development 30: 49–55. Baez-Ortega, Adrian, Kevin Gori, Andrea Strakova, Janice L. Allen, Karen M. Allum, Leontine Bansse-Issa, Thinlay N. Bhutia, et al. 2019. “Somatic Evolution and Global Expansion of an Ancient Transmissible Cancer Lineage.” Science 365 (6452): eaau9923. Leathlobhair, Máire Ní, Angela R. Perri, Evan K. Irving-Pease, Kelsey E. Witt, Anna Linderholm, James Haile, Ophelie Lebrasseur, et al. 2018. “The Evolutionary History of Dogs in the Americas.” Science 361 (6397): 81–85.
Prerequisite Skills	Statistics; Data Visualization
Other Skills Used in the Project	Probability/Markov Chains; Simulation
Programming Languages	Python; R

Algebraic geometry and problem-based learning

Project Title	Algebraic geometry and problem-based learning
Keywords	algebraic geometry, schemes, mathematical exposition, scientific writing, mathematics education
Contact Name	Anthony Bordg
Contact Email	apdb3@cam.ac.uk
Company/Lab/Department	Department of Computer Science and Technology
Address	William Gates Building, JJ Thomson Avenue, Cambridge CB3 0FD
Period of the Project	4 weeks
Work Environment	The student will work with Anthony Bordg. Remote work possible.
Project Open to	Undergraduates; Master's (Part III) students
Background Information
Brief Description of the Project	This project will deal with issues in teaching and learning a very abstract field of mathematics: algebraic geometry. The point of view of a student is desirable and will be valued. Together with Anthony Bordg the student will work on completing an exposition of schemes in algebraic geometry that intends to fill a wide gap in the literature between nontechnical presentations and advanced textbooks. The final goal is the publication of an expository article on the topic in a mathematics journal, e.g. Emergent Scientist, Rocky Mountain Journal of Mathematics ... The student will be given extensive guidance and training in mathematical writing.
References	Anthony Bordg, "What is a Scheme in Algebraic Geometry? A Problem-Oriented Approach", https://drive.google.com/file/d/19hsesOZl70hmzYxcV_OgINOgIg2SBD1q/view
Prerequisite Skills	Geometry/Topology; Algebra/Number Theory; algebraic geometry
Other Skills Used in the Project
Programming Languages

Formalising Modular Forms and Dirichlet Series in Isabelle/HOL

Project Title	Formalising Modular Forms and Dirichlet Series in Isabelle/HOL
Keywords	number theory, modular forms, interactive theorem proving, type theory, Isabelle/HOL
Contact Name	Anthony Bordg
Contact Email	apdb3@cam.ac.uk
Company/Lab/Department	Department of Computer Science and Technology
Address	William Gates Building, JJ Thomson Avenue, Cambridge CB3 0FD
Period of the Project	8 weeks
Work Environment	Your supervisor will be Anthony Bordg, but you will interact with the whole ALEXANDRIA team led by Prof. Larry Paulson.
Project Open to	Undergraduates; Master's (Part III) students
Background Information
Brief Description of the Project	You will work on a formalisation in Isabelle/HOL of Apostol's textbook "Modular Functions and Dirichlet Series in Number Theory". The main definitions and statements have already been formalised (see the GitHub repo in References), hence you will focus on proving these statements with the help of Isabelle/HOL's efficient automation. This could lead to a pioneering and high-impact work in the fast-growing field of the formalisation of mathematics.
References	- Modular Functions and Dirichlet Series in Isabelle/HOL, https://github.com/AnthonyBordg/Number_Theory (GitHub repo) - Tom Apostol, Modular Functions and Dirichlet Series in Number Theory, Springer - Isabelle Zulip chat: https://isabelle.zulipchat.com
Prerequisite Skills	Mathematical Analysis; Algebra/Number Theory; complex analysis
Other Skills Used in the Project	Previous experience with Isabelle/HOL or any other proof assistant (Coq, Lean ...) is desirable but not necessary. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html
Programming Languages	Isabelle/HOL

Bayesian Learning for Automation

Project Title	Bayesian Learning for Automation
Keywords	Bayesian learning; Neural networks; MCMC; Reinforcement Learning
Contact Name	Sumeetpal S. Singh
Contact Email	sss40@cam.ac.uk
Company/Lab/Department	Engineering
Address	Department of Engineering, Trumpington Street, CB2 1PZ
Period of the Project	10-12 weeks (late June/early July start)
Work Environment	Join a team involving 2 other PhD students working on related topics. Based at the Department of Engineering with a mixture of remote and in-person presence. Periodic updates and discussions with sponsors Mathworks.
Project Open to	Undergraduates; Master's (Part III) students
Background Information	One very promising technique for automation is to gather data from an expert demonstration and then learn the expert's policy using Bayesian inference. The learnt policy is then extrapolated to automate the task in novel settings. The potential applications of this approach are numerous, e.g. automated navigation. The key challenges of this technique of "control by mimicry" are: 1. Learning the expert's policy, a function, and accurately representing uncertainty. 2. To improve knowledge of the expert's policy, more data needs to be gathered. This should be done sparingly, as data gathering can be expensive, and also be guided by an optimality criterion, such as an information-theoretic criterion. Impact on academic area, on user community, industry, and beyond: This is an exciting and novel endeavour that exploits recent advances in Statistics for challenging automation problems. [1] Simon L Cotter, Gareth O Roberts, Andrew M Stuart, and David White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, pages 424–446, 2013. [2] T. Sell and S.S. Singh. Trace-class Gaussian priors for Bayesian learning of neural networks with MCMC. Under review. Arxiv E-print arXiv:2012.10943. [3] https://gym.openai.com/
Brief Description of the Project	Deliverables: (i) A generic MATLAB (potentially in collaboration with sponsor MathWorks) and Python implementation of our MCMC sampling algorithm for trace-class neural network priors [2] which can also be used more widely for other applications of Bayesian neural networks. (ii) Proof-of-concept automation implementations on exemplar tasks from OpenAI Gym [3]. (iii) Potential use of our Bayesian neural network sampling algorithm within the CUED curriculum (MEng level). Currently Bayesian neural networks are not taught, nor are they widely experimented with in the MEng projects. (iv) An outreach activity for years 5/6 school children in the field of Reinforcement learning/Automation via a micro:bit implementation. ALL CODE WILL BE MADE PUBLIC.
References	[1] Simon L Cotter, Gareth O Roberts, Andrew M Stuart, and David White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, pages 424–446, 2013. [2] T. Sell and S.S. Singh. Trace-class Gaussian priors for Bayesian learning of neural networks with MCMC. Under review. arXiv e-print arXiv:2012.10943. [3] https://gym.openai.com/
Prerequisite Skills	Statistics; Probability/Markov Chains
Other Skills Used in the Project	Numerical Analysis; PDEs; Mathematical Analysis
Programming Languages	Python; MATLAB; deliverable in MATLAB as required by sponsor.
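
To make the "MCMC methods for functions" idea in reference [1] concrete, the following is a minimal sketch of a preconditioned Crank-Nicolson (pCN) sampler, assuming a centred Gaussian prior with independent coordinates. It is not the project's algorithm from [2]: the linear policy, the toy expert demonstrations, the decaying prior standard deviations `1/(j+1)`, and all function names are illustrative assumptions, and the code uses only the Python standard library.

```python
import math
import random


def pcn_sample(log_likelihood, prior_std, n_steps, beta=0.2, seed=0):
    """Preconditioned Crank-Nicolson MCMC under a prior N(0, diag(prior_std^2)).

    The proposal v = sqrt(1 - beta^2) * u + beta * xi, with xi drawn from the
    prior, leaves the prior invariant, so the acceptance probability depends
    only on the likelihood ratio.
    """
    rng = random.Random(seed)
    keep = math.sqrt(1.0 - beta * beta)
    u = [s * rng.gauss(0.0, 1.0) for s in prior_std]  # start from a prior draw
    ll_u = log_likelihood(u)
    samples, n_accepted = [], 0
    for _ in range(n_steps):
        xi = [s * rng.gauss(0.0, 1.0) for s in prior_std]
        v = [keep * ui + beta * x for ui, x in zip(u, xi)]
        ll_v = log_likelihood(v)
        if rng.random() < math.exp(min(0.0, ll_v - ll_u)):
            u, ll_u = v, ll_v
            n_accepted += 1
        samples.append(list(u))
    return samples, n_accepted / n_steps


def make_log_likelihood(states, actions, noise_std):
    """Gaussian log-likelihood (up to a constant) of a linear policy a = w . s."""
    def log_likelihood(w):
        sq_error = 0.0
        for s, a in zip(states, actions):
            predicted = sum(wj * sj for wj, sj in zip(w, s))
            sq_error += (a - predicted) ** 2
        return -0.5 * sq_error / noise_std ** 2
    return log_likelihood


# Hypothetical expert demonstrations: noisy actions from a known linear policy.
demo_rng = random.Random(1)
true_w = [0.8, -0.3]
states = [[demo_rng.uniform(-1, 1), demo_rng.uniform(-1, 1)] for _ in range(20)]
actions = [sum(w * s for w, s in zip(true_w, st)) + demo_rng.gauss(0.0, 0.5)
           for st in states]

# Assumed prior: standard deviations decaying like 1/(j+1), so variances are summable.
prior_std = [1.0 / (j + 1) for j in range(len(true_w))]
samples, acc = pcn_sample(make_log_likelihood(states, actions, 0.5),
                          prior_std, n_steps=5000)

kept = samples[1000:]  # discard the first 1000 draws as burn-in
posterior_mean = [sum(s[j] for s in kept) / len(kept) for j in range(len(true_w))]
print("acceptance rate:", acc)
print("posterior mean of policy weights:", posterior_mean)
```

The spread of `samples` around `posterior_mean` is the uncertainty representation mentioned in challenge 1. A neural-network policy would replace the linear predictor inside `make_log_likelihood`, with the weight vector playing the role of `w`.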

AI for Coronary Artery Imaging

Project Title	AI for Coronary Artery Imaging
Keywords	Autoencoders, deep learning, python, imaging, computer vision
Contact Name	Mike Roberts
Contact Email	mr808@cam.ac.uk
Company/Lab/Department	DAMTP / Cardiology
Address	Department of Medicine / DAMTP
Period of the Project	8 weeks +
Work Environment	In DAMTP or the Department of Medicine
Project Open to	Undergraduates; Master's (Part III) students
Background Information	Coronary arteries can be imaged using Optical Coherence Tomography (OCT), which shows features of the artery from the inside and allows disease to be identified. These images are extremely high-dimensional, making many deep learning methods intractable.
Brief Description of the Project	We will apply deep learning methods for image compression, encoding and reconstruction to encode high-dimensional images as low-dimensional representations. This then allows for downstream identification of diseased tissue, quantification and prediction of outcomes.
References
Prerequisite Skills	Image Processing
Other Skills Used in the Project
Programming Languages	Python
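
As an illustration of the encode/reconstruct idea described above, here is a minimal sketch of a linear autoencoder trained by stochastic gradient descent. It is a toy under stated assumptions, not the project's method: the 8-value vectors stand in for images (no OCT data is used), the model is linear rather than deep, and all names, sizes and hyperparameters are hypothetical. It uses only the Python standard library, whereas real work would use a deep learning framework.

```python
import random


def encode(W, x):
    """Map an input vector to its low-dimensional code z = W x."""
    return [sum(wj * xj for wj, xj in zip(row, x)) for row in W]


def decode(V, z):
    """Reconstruct an input-sized vector from a code: x_hat = V z."""
    return [sum(vj * zj for vj, zj in zip(row, z)) for row in V]


def mean_reconstruction_error(W, V, data):
    """Average squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = decode(V, encode(W, x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / len(data)


def train_autoencoder(data, code_dim, epochs=300, lr=0.01, seed=0):
    """Fit encoder W and decoder V by per-sample gradient descent."""
    rng = random.Random(seed)
    n_pixels = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_pixels)] for _ in range(code_dim)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(code_dim)] for _ in range(n_pixels)]
    for _ in range(epochs):
        for x in data:
            z = encode(W, x)
            err = [a - b for a, b in zip(decode(V, z), x)]
            # Gradient of 0.5 * ||err||^2 with respect to the code z,
            # computed before the decoder weights are updated.
            grad_z = [sum(V[i][j] * err[i] for i in range(n_pixels))
                      for j in range(code_dim)]
            for i in range(n_pixels):
                for j in range(code_dim):
                    V[i][j] -= lr * err[i] * z[j]
            for j in range(code_dim):
                for m in range(n_pixels):
                    W[j][m] -= lr * grad_z[j] * x[m]
    return W, V


# Toy "images": 8 values generated from 2 hidden factors, so a 2-D code suffices.
data_rng = random.Random(1)
pattern_a = [1, 1, 1, 1, 0, 0, 0, 0]
pattern_b = [0, 0, 0, 0, 1, 1, 1, 1]
images = []
for _ in range(40):
    t1, t2 = data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)
    images.append([t1 * a + t2 * b for a, b in zip(pattern_a, pattern_b)])

W0, V0 = train_autoencoder(images, code_dim=2, epochs=0)  # untrained weights, same seed
W, V = train_autoencoder(images, code_dim=2)
error_before = mean_reconstruction_error(W0, V0, images)
error_after = mean_reconstruction_error(W, V, images)
print("reconstruction error before training:", error_before)
print("reconstruction error after training:", error_after)
```

The code `encode(W, x)` is the low-dimensional representation on which downstream tasks such as tissue classification would operate. A deep autoencoder follows the same pattern with nonlinear, multi-layer encoder and decoder networks.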