Summer Research Programmes

 

This is a list of the Academic CMP project proposals from summer 2022.

Dynamical modelling of mutation as an immune defence 

Project Title Dynamical modelling of mutation as an immune defence
Keywords Disease dynamics, virus evolution, computational modelling
Contact Name Jordan Skittrall
Contact Email jps55@cam.ac.uk
Company/Lab/Department Department of Pathology
Address Division of Virology, Addenbrooke's Hospital, Cambridge, CB2 0QQ
Period of the Project 8 weeks
Work Environment The intention is that you will work in the Division of Virology, Department of Pathology, which is on the Addenbrooke's site in Cambridge. Should it be necessary to work remotely this will be possible, but the aim would be to work in person to maximise opportunities for discussion of ideas. Hours are flexible but you should expect to work standard working weeks in total. Jordan Skittrall (Pathology) will be the day-to-day supervisor and contact. There will be the opportunity to get to know other members of the division and to join any meetings taking place during the placement period.
Project Open to Undergraduates; Master's (Part III) students
Background Information One form of immune response to viral infection involves deliberately mutating the virus to make it stop working. This is seen, for example, in HIV infection, where around an order of magnitude of the genetic diversity seen in viruses can be explained by this immune response, and where the virus makes a protein to defend against the response. It is also the mechanism by which some antiviral drugs work - including one of the drugs recently licensed for use in treating COVID-19. But it is a potentially risky strategy - mutations also allow viruses to escape immune responses and develop drug resistance.
Brief Description of the Project

The aim of this project is to develop a model to help us understand better how key parameters affect the balance between mutation helping the host, and mutation helping the virus. In particular, the aim is to understand to what extent both outcomes may occur in the same population if there is variability in the parameters within the population, and to estimate how often such outcomes occur. In the project, you will develop a simplified model capturing the key aspects of the mutation process, and implement and refine that model in one (or possibly more than one) infection scenario. To understand the parameter space developed, you will need to implement basic code to describe possible outcomes. This will be sufficiently straightforward that you can learn as you go if you have not done this before, but you should have basic familiarity with at least one programming language.

Funding for this project is not guaranteed, and if interested in this project you are encouraged to make contact as soon as possible, as many funding opportunities require a named student in the application and earlier contact will allow us to apply before deadlines for more funders. You will not be required to respond to any project offer before the common deadline for the placements.

References

A background to the way the mutational arms race proceeds in HIV as an example (although we may work on a problem with a simpler setup) is given in the following references:

  1. Armitage et al. "APOBEC3G-Induced Hypermutation of Human Immunodeficiency Virus Type-1 Is Typically a Discrete 'All or Nothing' Phenomenon" (https://doi.org/10.1371/journal.pgen.1002550)
  2. Sadler et al. "APOBEC3G Contributes to HIV-1 Variation through Sublethal Mutagenesis" (https://doi.org/10.1128/JVI.00056-10) [You do not need to understand the detail of the experiments performed in this paper, but should understand that it gives evidence that the real-world situation may be less clear-cut than suggested in reference (1).]

An introduction to a completely different way this problem can manifest is found in the following reference, which explains how molnupiravir, now licensed as a COVID-19 treatment, works:

  3. Kabinger et al. "Mechanism of molnupiravir-induced SARS-CoV-2 mutagenesis" (https://doi.org/10.1038/s41594-021-00651-0)
Prerequisite Skills Statistics;Dynamical Systems
Other Skills Used in the Project Probability/Markov Chains;Data Visualization;Mathematical Biology
Programming Languages Python;MATLAB;Mathematica

 

SARS-CoV-2 pandemic-scale phylogenetics 

Project Title SARS-CoV-2 pandemic-scale phylogenetics
Keywords Phylogenetics, mathematical modeling, molecular evolution, epidemiological modeling
Contact Name Nicola De Maio
Contact Email demaio@ebi.ac.uk
Company/Lab/Department EMBL-EBI
Address The European Bioinformatics Institute (EMBL-EBI) Wellcome Genome Campus Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
Period of the Project 8 weeks or preferably more
Work Environment The student will be supervised day by day by a senior postdoc in the group, interact regularly with 2 other PhD students in the group, attend weekly group meetings, and be supervised weekly by the group leader.
Project Open to Undergraduates;Master's (Part III) students
Background Information The COVID-19 pandemic has been accompanied by an extremely intense and widespread sequencing effort, resulting in millions of SARS-CoV-2 genomes being available to researchers to help investigate the spread and evolution of the virus. However, existing data analysis and mathematical modeling methods struggle to deal with this volume of data.
Brief Description of the Project The students will contribute to methods developed in our lab to efficiently and accurately investigate SARS-CoV-2 genome data, and, more generally, will work on phylogenetic methods that infer evolutionary history from DNA sequence data. The project will involve the development of efficient algorithms and mathematical models and their implementation in Python, C++ or Java. The project can involve the development of Markov models for phylogenetics or the use of neural networks for molecular evolution. The ideal outcome would be the development of novel mathematical models to accurately describe complex scenarios of DNA evolution in a computationally efficient manner.
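As a point of orientation for the Markov models mentioned above, the following is a minimal sketch of the Jukes-Cantor (JC69) substitution model, the simplest standard Markov model of DNA evolution. It is an illustration only, not the lab's method; the function names are hypothetical.

```python
import math

BASES = "ACGT"

def jc69_transition_matrix(branch_length):
    """Return P(t) under JC69 as a nested dict, P[a][b] = Pr(b at end | a at start).

    branch_length is the expected number of substitutions per site.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability the base is unchanged
    p_diff = 0.25 - 0.25 * decay   # probability of each specific other base
    return {a: {b: (p_same if a == b else p_diff) for b in BASES} for a in BASES}

def pair_log_likelihood(seq_a, seq_b, branch_length):
    """Log-likelihood of two aligned sequences separated by branch_length."""
    p = jc69_transition_matrix(branch_length)
    # Under JC69 every base has stationary frequency 1/4.
    return sum(math.log(0.25 * p[a][b]) for a, b in zip(seq_a, seq_b))
```

Maximising `pair_log_likelihood` over `branch_length` gives a distance estimate for one pair of sequences. Tree-based methods apply the same transition probabilities along every branch of a phylogeny.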
References
  1. https://www.nature.com/articles/s41588-021-00862-7
  2. https://doi.org/10.1093/gbe/evab087
  3. https://doi.org/10.1101/2021.03.15.435416
Prerequisite Skills Probability/Markov Chains;Programming, preferably in Python, but Java and C++ would also be extremely useful.
Other Skills Used in the Project Statistics;Simulation
Programming Languages Python;C++;Java

 

Discovering regions of functional importance in RNA viruses (13 February Deadline) 

Project Title Discovering regions of functional importance in RNA viruses
Keywords Computational modelling, virus evolution, bioinformatics, mathematical biology
Contact Name Jordan Skittrall
Contact Email jps55@cam.ac.uk
Company/Lab/Department Department of Pathology
Address Division of Virology, Addenbrooke's Hospital, Cambridge, CB2 0QQ
Period of the Project 8 weeks
Work Environment The intention is that you will work in the Division of Virology, Department of Pathology, which is on the Addenbrooke's site in Cambridge. Should it be necessary to work remotely this will be possible, but the aim would be to work in person to maximise opportunities for discussion of ideas. Hours are flexible but you should expect to work standard working weeks in total.
Project Open to Undergraduate students only
Background Information The genetic sequences of all organisms we see can be viewed as the results of a natural experiment in what is capable of surviving. By examining the sequences of these organisms we can analyse the results of that natural experiment. Viruses that replicate using RNA (a large proportion of the viruses we know) give some of the smallest examples of such sequences, making their analysis computationally tractable. By detecting regions of such viruses that have to be conserved for the virus to survive, we get pointers to elements of the viral lifecycle, and to possible drug targets.
Brief Description of the Project

This project would be especially suitable for somebody who wanted to explore moving into bioinformatics, or applying mathematical knowledge to microbiology. It will be advertised to both mathematicians and biologists, and the focus for a mathematician will be on learning the skills in handling biological data and in virology required to undertake an analysis and start interpreting the results. In this project you will apply to a set of virus genomes one of our mathematical techniques for searching for regions of interest.

You will need to:

  • develop sufficient understanding of the mathematics of the analysis pipeline to understand the implications of the mathematics for interpretation of results;
  • download and curate a set of virus sequences ready for analysis;
  • adapt code (written in Mathematica) in order to apply it to your dataset;
  • visualise the output (develop graphical methods of representing the results of your analysis);
  • be able to discuss your findings in a cross-disciplinary fashion with colleagues in mathematics and virology to come to a shared interpretation of the results (in terms of regions of interest you have identified in the viral genome).

There are a few options for viruses it would be possible to work with on this project, which can be discussed prior to application. The project is likely to focus on a virus capable of causing human disease for which drug treatment is sought. The background work underpinning this project stretches all the way from open problems in probability theory, through the mathematics of signal processing and bioinformatics, to wet lab molecular biology and clinical applications. It is a requirement of the funding stream we propose to seek for this project that the project be strongly microbiological in nature and the project background be in microbiology in such a way that students undertaking it will be exposed to core microbiology thinking. Funding for this project will be sought via application to an external body (and so cannot be guaranteed).

The deadline for the external body's funding applications necessitates an earlier deadline of *13th February 2022* for expressions of interest, although earlier expressions of interest will make funding application more straightforward.

References

These references are listed in priority order for background reading (the first two are key), but if you have time, the progression of ideas is slightly easier to follow if they are read chronologically. There is no need to have read the references prior to sending an enquiry about the project: they are included in case you would like more information. The references describe the mathematical underpinning of the techniques used, and demonstrate some previous applications of those techniques.

[1] "A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data." Julia R. Gog, Andrew M.L. Lever, Jordan P. Skittrall. PLoS ONE, 2018 13(4):e0195763. (https://doi.org/10.1371/journal.pone.0195763)
[2] "A scale-free analysis of the HIV-1 genome demonstrates multiple conserved regions of structural and functional importance." Jordan P. Skittrall, Carin K. Ingemarsdotter, Julia R. Gog, Andrew M.L. Lever. PLoS Comput Biol, 2019 15(9):e1007345. (https://doi.org/10.1371/journal.pcbi.1007345)
[3] "Codon conservation in the influenza A virus genome defines RNA packaging signals." Julia R. Gog, Emmanuel Dos Santos Afonso, Rosa M. Dalton, India Leclerq, Laurence Tiley, Debra Elton, Johann C. von Kirchbach, Nadia Naffakh, Nicolas Escriou, Paul Digard. Nucleic Acids Res, 2007 (35) 1897-1907. (https://doi.org/10.1093/nar/gkm087)
[4] "Genomic analysis of codon, sequence and structural conservation with selective biochemical-structure mapping reveals highly conserved and dynamic structures in rotavirus RNAs with potential cis-acting functions." Wilson Li, Emily Manktelow, Johann C. von Kirchbach, Julia R. Gog, Ulrich Desselberger, Andrew M. Lever. Nucleic Acids Res, 2010 (38) 7718-7735. (https://doi.org/10.1093/nar/gkq663)

Prerequisite Skills  
Other Skills Used in the Project Statistics;Database Queries;Data Visualization
Programming Languages Mathematica, small amounts of bash (it's fine to start the project with no knowledge of these)

 

Speech enhancement for hearing devices: learned sound representations versus deterministic transforms 

Project Title Speech enhancement for hearing devices: learned sound representations versus deterministic transforms
Keywords deep learning, speech enhancement, frequency domain, auditory filterbanks, speech intelligibility
Contact Name Clément Gaultier, Tobias Goehring
Contact Email Clement.Gaultier@mrc-cbu.cam.ac.uk
Company/Lab/Department Deep Hearing Lab, MRC Cognition and Brain Sciences Unit, University of Cambridge
Address Clement.Gaultier@mrc-cbu.cam.ac.uk
Period of the Project Any 8-week period from late June to early September
Work Environment The student will be part of the Deep Hearing Lab (1) based at the MRC Cognition and Brain Sciences Unit (2) and work with primary supervisor Dr. Clément Gaultier as well as secondary supervisor Dr. Tobias Goehring. The student will benefit from joining the Cambridge Hearing Group (3), a world-leading and vibrant research network in Cambridge. The candidate will have the opportunity to use applied mathematics for signal processing and learn basics of time-frequency sound analysis/synthesis. Remote working can be arranged if conditions make it difficult or not suitable to come to the office daily (depending on the evolving COVID situation).
(1): Deep Hearing Lab: https://www.deephearinglab.com
(2): MRC CBSU: https://www.mrc-cbu.cam.ac.uk
(3): Cambridge Hearing Group: https://www.hearing-research.group.cam.ac.uk
Project Open to Undergraduates;Master's (Part III) students
Background Information This project is part of a larger multidisciplinary project studying new speech enhancement strategies with the aim to improve sound perception for people using hearing devices (hearing aids, cochlear implants) in challenging listening situations with noise and reverberation.
Brief Description of the Project

Deep learning has brought substantial improvements to speech recognition, enhancement and separation systems by estimating a time-frequency mask (with a masking network) in a deterministic transform domain (e.g. the Fourier domain) computed from the time-domain sound signal. The masked representation is then transformed back to a time-domain signal (e.g. by an inverse Fourier transform) to yield the enhanced sound. In recent years, new artificial neural network architectures (Recurrent Neural Networks, Attention mechanisms), along with the introduction of end-to-end learning systems, have provided a significant boost in performance [1]. These new techniques work in an Encoder-Mask-Decoder fashion where the mask is no longer estimated from a fixed transformed domain but from learned representations available at the encoder stage. At the decoder stage, the masked representations are then transformed back to a time-domain signal ("decoded") to obtain the enhanced sound signals. This project will explore the impact of such learned representations over deterministic transforms (auditory inspired, discrete Fourier transforms, ...) in the context of noise and reverberation compensation for hearing-impaired people who use hearing aids or cochlear implants.
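The fixed-transform pipeline described above (transform, mask, inverse transform) can be sketched as follows. This is a minimal illustration assuming NumPy, not the lab's implementation: the frame length, the test signal and all function names are assumptions, and an oracle ratio mask computed from the known clean and noise signals stands in for the mask that a trained masking network would estimate.

```python
import numpy as np

FRAME = 256          # frame length in samples (assumed value)
HOP = FRAME // 2     # 50% overlap between frames
# Periodic Hann window: copies shifted by HOP sum to exactly 1,
# so overlap-add of unmodified frames reconstructs the input.
WINDOW = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(FRAME) / FRAME)

def stft(signal):
    """Fixed (deterministic) encoder: windowed frames -> Fourier domain."""
    n_frames = int(np.ceil(len(signal) / HOP)) + 1
    padded = np.zeros(HOP * (n_frames + 1))
    padded[HOP:HOP + len(signal)] = signal
    frames = np.stack([padded[i * HOP:i * HOP + FRAME] * WINDOW
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spectrum, length):
    """Fixed decoder: inverse Fourier transform followed by overlap-add."""
    frames = np.fft.irfft(spectrum, n=FRAME, axis=1)
    out = np.zeros(HOP * (len(frames) + 1))
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME] += frame
    return out[HOP:HOP + length]

def oracle_ratio_mask(clean, noise):
    """Stand-in for the masking network: needs the true clean and noise signals."""
    s = np.abs(stft(clean)) ** 2
    n = np.abs(stft(noise)) ** 2
    return s / (s + n + 1e-12)

def demo(seed=0):
    """Return mean squared error before and after masking for a noisy tone."""
    rng = np.random.default_rng(seed)
    t = np.arange(16000) / 16000.0
    clean = np.sin(2 * np.pi * 440.0 * t)          # synthetic "speech" stand-in
    noise = 0.5 * rng.standard_normal(len(t))
    noisy = clean + noise
    mask = oracle_ratio_mask(clean, noise)
    enhanced = istft(mask * stft(noisy), len(noisy))
    return np.mean((noisy - clean) ** 2), np.mean((enhanced - clean) ** 2)
```

In an Encoder-Mask-Decoder system, `stft` and `istft` would be replaced by learned linear layers, while the masking step in between keeps the same shape.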

The student will investigate research questions including, but not limited to, the following:
1. How does learning the encoder, decoder or both affect speech enhancement performance compared to applying deterministic auditory inspired transforms?
2. Are there any properties of learned or fixed transforms that allow complexity or dimensionality reduction with similar performance?

The outcomes of the study will be systematically evaluated using objective measures and pilot listening tests and results will contribute to an ongoing research project. The student will use common high-level (Python) deep learning frameworks for speech separation/enhancement including filterbank design tools for comparing the encoder-decoder models and deterministic transforms [2].

References [1]: Luo, Yi, and Nima Mesgarani. 'TasNet: time-domain audio separation network for real-time, single-channel speech separation.' 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018.
[2]: Pariente, Manuel, et al. 'Asteroid: the pytorch-based audio source separation toolkit for researchers.' arXiv preprint arXiv:2005.04132 (2020).
Prerequisite Skills Statistics;Algebra/Number Theory;The candidate should have basic knowledge of a programming language and be keen to learn or use Python and high-level signal processing tools. Basic knowledge of statistics and linear algebra. An interest in acoustics or speech processing is a plus.
Other Skills Used in the Project  
Programming Languages Python;MATLAB;R;C++;No Preference

 

Decoding the Neural Signature of Speech Perception

Project Title Decoding the Neural Signature of Speech Perception
Keywords Cognitive neuroscience, brain-computer interface, speech perception, audio processing
Contact Name Tobias Goehring and/or Alexis Deighton MacIntyre
Contact Email alexisdeighton.macintyre@mrc-cbu.cam.ac.uk
Company/Lab/Department Deep Hearing Lab, MRC Cognition and Brain Sciences Unit, University of Cambridge
Address alexisdeighton.macintyre@mrc-cbu.cam.ac.uk
Period of the Project 8 weeks
Work Environment The student will work together with one primary supervisor (Alexis Deighton MacIntyre/Post Doc) but will also benefit from close cooperation with other members of a friendly and vibrant lab, as well as connections within the broader Cambridge Hearing Group community, a multidisciplinary affiliation spanning brain sciences, medicine, and engineering, and the MRC-CBU, a world-leading research centre in basic and translational cognitive neuroscience. For more information, see https://www.hearing-research.group.cam.ac.uk/ and https://www.mrc-cbu.cam.ac.uk/
Project Open to Undergraduates; Master's (Part III) students
Background Information Electroencephalography (EEG) is a method to record brain activity in the form of electrical signals at the scalp. Recent analytical advances allow us to infer or reconstruct aspects of subjective, auditory experiences, such as speech perception, using a listener's neural data alone. This development may hold promise for applications in brain-computer interfaces, such as smart hearing aids or cochlear implants that aim to optimise auditory perception for people with hearing loss.
Brief Description of the Project Various techniques exist to correlate external stimuli, like acoustic recordings of speech, with EEG data. Some popular approaches include mutual information (MI) analysis and the fitting of temporal response functions (TRF) using regularised linear regression. One problem is that the resulting information and/or correlation values, though statistically robust, tend to be very small in magnitude, suggesting room for improvement. It may be that the specific choice of ground truth (e.g., engineered acoustic features) is inappropriate, or that the strength of the correspondence between stimulus and neural response varies over time, in which case a temporally dynamic approach may be preferable. Finally, data-driven models derived with machine learning (ML) may complement and/or surpass the techniques described above, which assume a simple stimulus-to-brain mapping. Using EEG data from human listeners, the student's role will entail the systematic comparison of different input acoustic features, whilst taking the form of analysis (e.g., MI, TRF) as well as temporal factors (e.g., discrete versus continuous sampling, effects of window length) into account. The objective is to determine the impact of choice of feature on the overall measure of correspondence between acoustic stimuli and brain signals. The results of this project will contribute to a bigger, ongoing project and inform optimal decoding approaches with applications in hearing research. Depending on the student's interest, there is also some opportunity to explore ML-based, non-linear methods and compare them to more established, but constrained, linear approaches.
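To make the TRF idea concrete, the following is a minimal sketch of fitting a temporal response function by ridge (regularised linear) regression on simulated data, assuming NumPy. It is not the mTRF toolbox and not the lab's pipeline: the kernel, noise level, ridge value and function names are all assumptions, and the simulated noise is far milder than in real EEG.

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Design matrix whose column k is the stimulus delayed by k samples."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = stimulus[:n - k]   # zero-padded at the start
    return X

def fit_trf(stimulus, response, n_lags, ridge):
    """Ridge solution w = (X^T X + ridge * I)^{-1} X^T y."""
    X = lagged_design(stimulus, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_lags), X.T @ response)

def demo(seed=0):
    """Simulate one channel, fit a TRF, and return (truth, estimate, correlation)."""
    rng = np.random.default_rng(seed)
    stimulus = rng.standard_normal(5000)            # stand-in acoustic feature
    true_kernel = np.array([0.0, 0.5, 1.0, 0.5, 0.0, -0.3, 0.0, 0.0])
    response = np.convolve(stimulus, true_kernel)[:len(stimulus)]
    response += rng.standard_normal(len(stimulus))  # measurement noise
    estimate = fit_trf(stimulus, response, n_lags=len(true_kernel), ridge=1.0)
    prediction = lagged_design(stimulus, len(true_kernel)) @ estimate
    # In-sample correlation; a real analysis would use held-out data.
    r = np.corrcoef(prediction, response)[0, 1]
    return true_kernel, estimate, r
```

Swapping the stimulus column for a different acoustic feature and comparing the resulting correlation is the kind of feature comparison the project describes.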
References Crosse, M. J., Di Liberto, G. M., Bednar, A., & Lalor, E. C. (2016). The multivariate temporal response function (mTRF) toolbox: a MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in human neuroscience, 10, 604.
Ding, N., & Simon, J. Z. (2014). Cortical entrainment to continuous speech: functional roles and interpretations. Frontiers in human neuroscience, 8, 311.
Prerequisite Skills Statistics;Mathematical Analysis;Predictive Modelling
Other Skills Used in the Project Machine learning, biomedical data analysis
Programming Languages Python;MATLAB

 

Formalising Thom encoding in Isabelle/HOL 

Project Title Formalising Thom encoding in Isabelle/HOL
Keywords Mechanising mathematics, proof assistant, real algebraic geometry, computer algebra
Contact Name Wenda Li
Contact Email wl302@cam.ac.uk
Company/Lab/Department Department of Computer Science and Technology, University of Cambridge
Address William Gates Building JJ Thomson Avenue Cambridge. CB3 0FD
Period of the Project 8 weeks
Work Environment The student will mainly collaborate with me, but they are also welcome to chat with others in the ALEXANDRIA group (https://www.cl.cam.ac.uk/~lp15/Grants/Alexandria/). There is a possibility that the project will be remote depending on the situation of the pandemic.
Project Open to Undergraduates;Master's (Part III) students
Background Information Modern proof assistants allow users to interact with computers to mechanise mathematical theorems and their proofs. Here, all derivation steps will be mechanically checked, so that ambiguities and errors in normal hand-written proofs can be eliminated. Recent progress in proof assistants includes Grothendieck's Schemes in the Isabelle proof assistant [1] and some recent results from Peter Scholze being mechanised in the Lean proof assistant [2]. This project will involve doing mechanised proofs within the Isabelle proof assistant.
Brief Description of the Project Real algebraic numbers are usually encoded as a univariate integer polynomial P and an interval (with rational end points) such that there is exactly one root of P within this interval. However, this encoding is not sufficient in the field of rationals extended with real algebraic numbers and infinitesimals, since a polynomial with infinitesimal coefficients can have roots that cannot be isolated with a rational interval. To address this problem, we may need Thom encoding to distinguish polynomial roots in a non-archimedean field, as has been implemented in the Z3 SMT solver [4]. In the project, the goal is to formalise the fundamental property of Thom encoding in Isabelle/HOL (Proposition 2.28 in Bochnak, Coste and Roy [3]).
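The usual interval encoding described above can be sketched in a few lines. The following illustration checks that a rational interval isolates exactly one root by counting roots with a Sturm chain, and then refines the interval by bisection. It shows the standard polynomial-plus-interval encoding only, not Thom encoding, and it is written in Python rather than Isabelle. The function names are hypothetical, and the polynomial is assumed to be square-free with degree at least 1.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients (coefficients are stored lowest degree first)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def poly_eval(p, x):
    result = Fraction(0)
    for c in reversed(p):          # Horner's rule
        result = result * x + c
    return result

def derivative(p):
    return [i * c for i, c in enumerate(p)][1:]

def remainder(a, b):
    """Remainder of polynomial a divided by polynomial b, over the rationals."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= factor * c
        a = trim(a[:-1])           # the leading term has just been cancelled
    return a

def sturm_chain(p):
    chain = [trim([Fraction(c) for c in p])]
    chain.append(derivative(chain[0]))
    while True:
        r = remainder(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def sign_changes(chain, x):
    values = [v for v in (poly_eval(q, x) for q in chain) if v != 0]
    return sum(1 for s, t in zip(values, values[1:]) if (s > 0) != (t > 0))

def count_roots(p, lo, hi):
    """Number of distinct real roots of square-free p in the interval (lo, hi]."""
    chain = sturm_chain(p)
    return sign_changes(chain, Fraction(lo)) - sign_changes(chain, Fraction(hi))

def refine(p, lo, hi, steps):
    """Bisect an isolating interval, keeping the half that contains the root."""
    lo, hi = Fraction(lo), Fraction(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if count_roots(p, lo, mid) == 1:
            hi = mid
        else:
            lo = mid
    return lo, hi
```

For example, the pair `([-2, 0, 1], (1, 2))` encodes the square root of 2 as the unique root of x^2 - 2 in that interval, and `refine` narrows the interval as far as required.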
References [1] Bordg, Anthony, Lawrence Paulson, and Wenda Li. "Simple Type Theory is not too Simple: Grothendieck's Schemes without Dependent Types." arXiv preprint arXiv:2104.09366 (2021).
[2] Castelvecchi, Davide. "Mathematicians welcome computer-assisted proof in 'grand unification' theory." Nature (2021).
[3] Bochnak, J., Coste, M. and Roy, M. F. (2013). Real algebraic geometry (Vol. 36). Springer.
[4] Grant Passmore and Leonardo de Moura (2013). Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals. Proceedings of the 24th International Conference on Automated Deduction (CADE-24).
Prerequisite Skills Mathematical Analysis;Prior knowledge of proof assistants (e.g., Coq, Lean, Isabelle) is preferred but not required
Other Skills Used in the Project  
Programming Languages Isabelle

 

Representing plant hydraulics and plant water stress in a dynamic global vegetation model 

Project Title Representing plant hydraulics and plant water stress in a dynamic global vegetation model
Keywords Water stress, plants, ABA, stomata
Contact Name Prof. Andrew D. Friend
Contact Email adf10@cam.ac.uk
Company/Lab/Department Department of Geography
Address Department of Geography, University of Cambridge, Downing Place, Cambridge CB2 3EN
Period of the Project 8 weeks summer 2022
Work Environment The project is set up to complement current work in our research group and give the student a realistic research experience, so there is some room to customize the project based on the student's interests and skills. The project will be performed using MATLAB (or similar programming language) for the data analysis and FORTRAN for the implementation of sub models in HYBRID. Model runs will be carried out on the Cambridge CSD3 cluster. The student will work closely with a PhD student.
Project Open to Undergraduates; Master's (Part III) students
Background Information In a changing climate the global hydrological cycle will alter significantly (Abbott et al., 2019) and drought conditions will become more frequent and intermittent (Greve et al., 2019; Pokhrel et al., 2021; Samaniego et al., 2018). Under these conditions it is critical for plants to manage their water economy efficiently. Modeling plant water stress is a key component of dynamic global vegetation models (Bonan et al., 2014; Eller et al., 2020; Kennedy et al., 2019) as it is necessary to correctly predict plant productivity and mortality during periods of drought (Stocker et al., 2019; Trugman et al., 2018).
Brief Description of the Project In this project we would like you to work on implementing alternative stomatal conductance models and water stress functions into the HYBRID vegetation model, as well as further representing plant signaling pathways that lead to stomatal closure, regulating water loss. One signaling pathway that could be included is abscisic acid (ABA) concentration, which is a well-known driver of stomatal closure. This would include creating a sub-model for ABA synthesis, transport, and sequestration, as well as the changing sensitivity of stomatal conductance to ABA concentration (this could be based on the work of Dewar, 2002). Furthermore, the resulting plant water fluxes during drought can be compared and analyzed with respect to microwave observations of vegetation water content.
References
  • Abbott, B.W., Bishop, K., Zarnetske, J.P., Minaudo, C., Chapin, F.S., Krause, S., Hannah, D.M., Conner, L., Ellison, D., Godsey, S.E., Plont, S., Marçais, J., Kolbe, T., Huebner, A., Frei, R.J., Hampton, T., Gu, S., Buhman, M., Sara Sayedi, S., Ursache, O., Chapin, M., Henderson, K.D., Pinay, G., 2019. Human domination of the global water cycle absent from depictions and perceptions. Nat. Geosci. 12, 533-540. https://doi.org/10.1038/s41561-019-0374-y
  • Bonan, G.B., Williams, M., Fisher, R.A., Oleson, K.W., 2014. Modeling stomatal conductance in the earth system: linking leaf water-use efficiency and water transport along the soil-plant-atmosphere continuum. Geosci. Model Dev. 7, 2193-2222. https://doi.org/10.5194/gmd-7-2193-2014
  • Dewar, R.C., 2002. The Ball-Berry-Leuning and Tardieu-Davies stomatal models: synthesis and extension within a spatially aggregated picture of guard cell function. Plant. Cell Environ. 25, 1383-1398. https://doi.org/10.1046/j.1365-3040.2002.00909.x
  • Eller, C.B., Rowland, L., Mencuccini, M., Rosas, T., Williams, K., Harper, A., Medlyn, B.E., Wagner, Y., Klein, T., Teodoro, G.S., Oliveira, R.S., Matos, I.S., Rosado, B.H.P., Fuchs, K., Wohlfahrt, G., Montagnani, L., Meir, P., Sitch, S., Cox, P.M., 2020. Stomatal optimization based on xylem hydraulics (SOX) improves land surface model simulation of vegetation responses to climate. New Phytol. 226, 1622-1637. https://doi.org/10.1111/nph.16419
  • Greve, P., Roderick, M.L., Ukkola, A.M., Wada, Y., 2019. The aridity Index under global warming. Environ. Res. Lett. 14, 124006. https://doi.org/10.1088/1748-9326/ab5046
  • Kennedy, D., Swenson, S., Oleson, K.W., Lawrence, D.M., Fisher, R., Lola da Costa, A.C., Gentine, P., 2019. Implementing Plant Hydraulics in the Community Land Model, Version 5. J. Adv. Model. Earth Syst. 11, 485-513. https://doi.org/10.1029/2018MS001500
  • Pokhrel, Y., Felfelani, F., Satoh, Y., Boulange, J., Burek, P., Gädeke, A., Gerten, D., Gosling, S.N., Grillakis, M., Gudmundsson, L., Hanasaki, N., Kim, H., Koutroulis, A., Liu, J., Papadimitriou, L., Schewe, J., Müller Schmied, H., Stacke, T., Telteu, C.-E., Thiery, W., Veldkamp, T., Zhao, F., Wada, Y., 2021. Global terrestrial water storage and drought severity under climate change. Nat. Clim. Chang. https://doi.org/10.1038/s41558-020-00972-w
  • Samaniego, L., Thober, S., Kumar, R., Wanders, N., Rakovec, O., Pan, M., Zink, M., Sheffield, J., Wood, E.F., Marx, A., 2018. Anthropogenic warming exacerbates European soil moisture droughts. Nat. Clim. Chang. 8, 421-426. https://doi.org/10.1038/s41558-018-0138-5
  • Stocker, B.D., Zscheischler, J., Keenan, T.F., Prentice, I.C., Seneviratne, S.I., Peñuelas, J., 2019. Drought impacts on terrestrial primary production underestimated by satellite monitoring. Nat. Geosci. 12, 264-270. https://doi.org/10.1038/s41561-019-0318-6
  • Trugman, A.T., Medvigy, D., Mankin, J.S., Anderegg, W.R.L., 2018. Soil Moisture Stress as a Major Driver of Carbon Cycle Uncertainty. Geophys. Res. Lett. 45, 6495-6503. https://doi.org/10.1029/2018GL078131
Prerequisite Skills Predictive Modelling
Other Skills Used in the Project  
Programming Languages MATLAB; Fortran

 

Climate Repair: Ice Thickening 

Project Title Climate Repair: Ice Thickening
Keywords Arctic, ice, geoengineering, partial differential equations, numerical modelling
Contact Name Katie Parker (Project Manager), Professor Hugh Hunt (Supervisor)
Contact Email kvp24@dow.cam.ac.uk
Company/Lab/Department Centre for Climate Repair / Engineering
Address Centre for Climate Repair, Downing College
Period of the Project 8-10 weeks at a date to be agreed between June and Sept
Work Environment The project can be undertaken remotely or in person, subject to discussion at interview. The team typically work office hours (9-5) but are flexible.
Project Open to Undergraduates; Master's (Part III) students
Background Information The Arctic is melting fast. The Centre for Climate Repair (CCRC) is looking at technologies that might slow down or reverse this melting. One idea is to spray seawater onto existing ice during the cold winter thereby thickening it so that it will last through the Arctic summer. A fourth-year engineering student has developed ice-thickening experiments and has made interesting measurements with water flowing in a channel inside a freezer at -18°C, but there is a need for a mathematical model to enable the measurements to be properly interpreted.
Brief Description of the Project This project is ideally suited to an applied mathematician or engineer who is comfortable with partial differential equations and numerical methods. The balance between heat transfer, sensible heat and latent heat in flowing water is formulated in terms of distance and time. As if this isn't complicated enough, salt water as it freezes develops a salt-concentration gradient which needs to be included. If we have a good model, then we can use it to design methodologies for creating ice in the Arctic.
References Peter Wadhams 'A Farewell to Ice'; Centre for Climate Repair website, especially working papers - https://www.climaterepair.cam.ac.uk/working-papers; a couple of papers from last year's internships available on request from kvp24@dow.cam.ac.uk
Prerequisite Skills Numerical Analysis; Mathematical Analysis; Predictive Modelling
Other Skills Used in the Project  
Programming Languages Python; MATLAB

 

Integrating microvascular biophysics with graph neural networks 

Project Title Integrating microvascular biophysics with graph neural networks
Keywords Machine learning; biophysics; blood vessels; graph neural networks; modelling
Contact Name Dr Paul Sweeney
Contact Email Paul.sweeney@cruk.cam.ac.uk
Company/Lab/Department Bohndiek Lab, Cancer Research UK Cambridge Institute, University of Cambridge
Address Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE
Period of the Project 8 weeks (flexible)
Work Environment Lab based - hours typically 10am - 4pm. Lab consists of several post-docs and PhD students. Can flexibly work from home at times.
Project Open to Undergraduates; Master's (Part III) students
Background Information Many real-world datasets can be defined in the form of a graph, for example, social networks, protein interactions and road networks. An active area of interest in machine learning is graph neural networks (GNNs), which operate on graph data. Application of GNNs has led to progress in fake news detection, antibacterial discovery and traffic detection. Biomedical imaging can extract information on the structure of large microvascular networks (10^6 vessels) which can be used as input into mathematical models to investigate biological transport phenomena. These networks of blood vessels can also be represented as graphs and so are amenable to GNNs. Due to the ever increasing size of these datasets, as a result of advances in imaging, it would be interesting to see if GNNs can rapidly generate accurate predictions relating to the inherent structure of these networks, in addition to predicting more complex biophysics.
Brief Description of the Project The aim of the project will be for the student to build their own graph neural network in Python (using standard APIs e.g., TensorFlow / Keras) to form useful predictions relating to properties of vascular networks. Students will generate their own synthetic vascular graphs using existing software, to build a library of synthetic data, as well as utilise existing blood vessel structural datasets obtained via biomedical imaging. These data could be used to model blood flow using existing packages (C++, no experience needed) as well as act as input to their GNN. Initially the student will develop their GNN for undirected graphs to predict basic properties of the graph (e.g., path of least resistance). Next the GNN will be developed to incorporate directed graphs and trained against their blood flow simulations, enabling their GNN to predict biophysical properties for any arbitrary graph used as input. A successful project is one in which the student gains competence and confidence in coding, model design and application. The project is open-ended, evolving as new ideas arise and dependent on student progress.
References
Prerequisite Skills Simulation; Predictive Modelling; A basic understanding of machine learning & coding experience
Other Skills Used in the Project Statistics; Mathematical physics; PDEs
Programming Languages Python; C++
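
As an illustration of the "path of least resistance" target mentioned in the description above, the sketch below computes that path directly, which is one way ground-truth labels for a GNN could be generated. It is only a sketch under stated assumptions: each vessel is treated as a rigid cylinder obeying Poiseuille's law, blood viscosity is taken as a single constant, and the function names and the tuple layout of a vessel are hypothetical rather than taken from the project's software.

```python
import heapq
from math import pi

BLOOD_VISCOSITY = 3.5e-3  # Pa.s; an assumed constant effective viscosity


def poiseuille_resistance(length, radius, viscosity=BLOOD_VISCOSITY):
    """Hydraulic resistance of a cylindrical segment (Poiseuille's law)."""
    return 8.0 * viscosity * length / (pi * radius ** 4)


def least_resistance_path(vessels, source, target):
    """Dijkstra's algorithm on an undirected vessel graph.

    vessels: iterable of (node_a, node_b, length, radius) tuples (hypothetical
    layout). Node labels must be mutually comparable, e.g. all integers.
    Returns (total series resistance, list of nodes along the path).
    """
    graph = {}
    for a, b, length, radius in vessels:
        r = poiseuille_resistance(length, radius)
        graph.setdefault(a, []).append((b, r))
        graph.setdefault(b, []).append((a, r))

    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a better route was already found
        for neighbour, r in graph.get(node, []):
            candidate = dist + r  # resistances in series add
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Two wide segments in series (0-1-3) versus narrower alternatives.
example_vessels = [
    (0, 1, 100e-6, 10e-6),
    (1, 3, 100e-6, 10e-6),
    (0, 2, 100e-6, 5e-6),
    (2, 3, 100e-6, 5e-6),
    (0, 3, 100e-6, 4e-6),
]
total, path = least_resistance_path(example_vessels, 0, 3)
print(path)  # [0, 1, 3]
```

Because resistance scales with the inverse fourth power of radius, the two-hop route through the wide vessels beats the single narrow vessel joining nodes 0 and 3 directly.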

 

Formalisation of material in number theory/additive combinatorics using Isabelle/HOL

Project Title Formalisation of material in number theory/additive combinatorics using Isabelle/HOL
Keywords number theory, additive combinatorics, proof assistants, interactive theorem proving, Isabelle/HOL
Contact Name Dr. Angeliki Koutsoukou-Argyraki
Contact Email ak2110@cam.ac.uk
Company/Lab/Department University of Cambridge, Department of Computer Science and Technology (Computer Laboratory)
Address 15 JJ Thomson Avenue, Cambridge CB3 0FD
Period of the Project 8 weeks
Work Environment ALEXANDRIA group, please see https://www.cl.cam.ac.uk/~lp15/Grants/Alexandria/. We will be working remotely.
Project Open to Undergraduates; Master's (Part III) students
Background Information  
Brief Description of the Project The student will participate in a project involving the formalisation of material in number theory/additive combinatorics using the proof assistant (interactive theorem prover) Isabelle/HOL.
References Recent related work:
Prerequisite Skills Mathematical Analysis; Algebra/Number Theory
Other Skills Used in the Project Previous experience in Isabelle/HOL or other proof assistants is desirable but not necessary. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html
Programming Languages Isabelle/HOL. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html

 

Deep learning for microscopy image reconstruction 

Project Title Deep learning for microscopy image reconstruction
Keywords deconvolution, deep learning, lightsheet microscopy, image reconstruction
Contact Name Leila Muresan
Contact Email lam94@cam.ac.uk
Company/Lab/Department Dept. of Physiology, Development and Neuroscience
Address Anatomy building, CB2 3DY
Period of the Project 8 weeks
Work Environment The student will be part of an ongoing collaboration between MRC-LMB (Jerome Boulanger), DAMTP (Yury Korolev), the Quantitative Biology Institute, Yale University (Bogdan Toader) and PDN (Leila Muresan). There will be weekly meetings of the entire team (possibly online); otherwise the schedule and location arrangements are flexible.
Project Open to Master's (Part III) students
Background Information In recent years, machine learning and especially deep-learning techniques have had a huge impact on microscopy image analysis. For instance, the solutions of difficult segmentation tasks were hugely improved for both 2D and 3D data, and several image reconstruction and denoising methods have been designed that currently constitute the state of the art. Deep learning has also been used to model the data for single molecule localization microscopy, providing insightful forward models.
Brief Description of the Project This project will focus on combining learned regularization and forward models for solving deconvolution problems in a mathematically coherent manner. The goal is to exploit the flexibility of learned models and the high performance of learned regularizers over traditional regularization methods while maintaining mathematical guarantees for the reconstructed images. (The goal of the project is open-ended; the candidate will contribute to ongoing work on lightsheet microscopy deconvolution.)
References  
Prerequisite Skills  
Other Skills Used in the Project Numerical Analysis; Image processing; Simulation
Programming Languages No Preference

 

Bayesian machine learning and theory in cosmology and particle physics 

Project Title Bayesian machine learning and theory in cosmology and particle physics
Keywords Bayesian Inference; Machine Learning; Cosmology
Contact Name Dr Will Handley
Contact Email wh260@cam.ac.uk
Company/Lab/Department Kavli Institute for Cosmology/Cavendish Laboratory
Address Kavli Institute for Cosmology/Cavendish Laboratory
Period of the Project 8-12 weeks (depending on funding)
Work Environment Working with my postdocs and PhD students in the KICC
Project Open to Undergraduates; Master's (Part III) students
Background Information  
Brief Description of the Project

In this project the student will work with Dr Handley and his team investigating the development and application of Bayesian machine learning techniques to modern and future cosmological and particle physics datasets.

The precise details of the project will be tailored to the student's interests and skill set, but possible topics/projects include:
1. Developing machine learning algorithms for nested sampling and applying these to cosmological data sets - https://arxiv.org/abs/1506.00171 - https://arxiv.org/abs/2007.08496
2. Model independent reconstruction of the primordial universe from cosmic microwave background data and cosmic dawn data - https://arxiv.org/abs/1908.00906
3. Developing and applying mathematical schemes for disentangling physical signatures in the primordial universe - https://arxiv.org/abs/1907.08524 - https://arxiv.org/abs/2009.05573
4. Combining particle physics and cosmological data as part of the GAMBIT team - https://gambit.hepforge.org/ - https://arxiv.org/abs/2009.03286 - https://arxiv.org/abs/2009.03287
5. Investigating quantum initial conditions for inflation - https://arxiv.org/abs/2112.07547 - https://arxiv.org/abs/1607.04148

Over the course of the project students can expect to learn some/all of:

  • up-to-date cosmological research questions
  • Science-grade Python
  • High performance computing
  • Bayesian inference
  • Machine learning
  • Computer algebra

Essential:

  • Three years of undergraduate physics, mathematics or equivalent
  • Basic to intermediate Python experience and good programming skills
  • Strong mathematical skills

Desirable:

  • Interest/Knowledge of general relativity/cosmology
  • Experience using Mathematica/Maple/Computer algebra
References  
Prerequisite Skills  
Other Skills Used in the Project  
Programming Languages Python; Mathematica or Maple

 

Novel Flexible Polyhedra

Project Title Novel Flexible Polyhedra
Keywords geometry, computer programming
Contact Name Simon Guest
Contact Email sdg@eng.cam.ac.uk
Company/Lab/Department Engineering Department
Address sdg@eng.cam.ac.uk
Period of the Project 8 weeks
Work Environment The student will be part of a small group in the Civil Engineering Building in West Cambridge.
Project Open to Undergraduates; Master's (Part III) students
Background Information Cauchy showed that all convex polyhedra are rigid, but it wasn't until the 1970s that Bob Connelly found a non-convex polyhedron that was flexible. A related result is Alexandrov's uniqueness theorem, which shows that any polyhedron with a given metric (i.e. that can be folded from a net) has a unique convex realisation - a constructive proof of this was only found by Bobenko and Izmestiev in 2008. These results are closely connected to recent work in rigid origami, and a better understanding of the connection between them might allow us to develop novel foldable structures, for instance for use in spacecraft.
Brief Description of the Project The project will start by developing a computer implementation of Bobenko and Izmestiev's algorithm for finding convex realisations for polyhedra, and then examine whether this algorithm can be modified to understand better the behaviour of (clearly non-convex) flexible polyhedra. This might give us insight to develop new flexible polyhedra, for instance novel forms that are not fully triangulated.
References Much of the background is given in the recent book 'Frameworks, Tensegrities, and Symmetry' published by CUP, and available electronically from the University Library.
Prerequisite Skills Geometry/Topology
Other Skills Used in the Project  
Programming Languages Python; MATLAB

 

Quantum stability of a novel theory of gravity

Project Title Quantum stability of a novel theory of gravity
Keywords modified gravity, torsion, computer algebra, effective field theory, cosmology
Contact Name Will Barker
Contact Email wb263@cam.ac.uk
Company/Lab/Department Cavendish astrophysics & KICC
Address Office K34, Kavli Institute for Cosmology, Cambridge, CB3 0HA
Period of the Project 8 weeks
Work Environment The student may work remotely, but would ideally have a desk at the Kavli Institute for Cosmology, Cambridge (KICC). We have a small modified gravity nexus belonging to the Cavendish Astrophysics Group, comprising Professors Lasenby and Hobson and myself. The broader environment in the Kavli is dominated by theoretical, observational and statistical cosmologists. The student would be encouraged to take advantage of seminars and networking both at the Kavli and the CMS. There are also opportunities to liaise remotely with astroparticle theorists at CEICO in Prague and at the Instituut Lorentz in Leiden. We have free coffee.
Project Open to Undergraduates; Master's (Part III) students
Background Information

Einstein's General Relativity (GR) remains the preferred effective theory of gravity as spacetime curvature, explaining the orbital precession of Mercury and solar bending of starlight while underpinning modern cosmology. However, GR does not explain dark matter or dark energy, while an alleged 'Hubble tension' indicates that our Universe is expanding 10% faster than it should be. And of course, GR continues to stubbornly resist attempts at (complete) quantum reformulation.

We recently attracted some attention by proposing an alternative theory of gravity as a blend of spacetime torsion and curvature: this appears to provide a cosmological constant and alleviate the Hubble tension. Our Lagrangian is wildly different from that of GR, with a quantum structure suggestive of renormalisability. We believe the theory adopts a torsion vacuum expectation value (VEV) at a primordial epoch, on the back of which the good classical phenomena emerge. However many quantum/classical aspects of this torsion VEV, and the violent early-Universe physics of its formation, remain shrouded in mystery...

Brief Description of the Project

The findings of the project may debunk our theory or, if we are lucky, propel it further into the spotlight of community interest. The student may wish to target one of several new fronts we are opening in our research campaign:

1) *Quantum stability and the infrared* -- This is an urgent question; we don't really know if the torsion VEV is stable against quantum fluctuations, as is the case for the Minkowski vacuum of GR. The student will apply well established effective field theory and ghost condensate techniques to characterise the infrared environment of the VEV. Extensions to the ultraviolet are of course welcome depending on expertise, though expected to be more challenging. A stable vacuum is quite a big deal, while convincing instabilities would seem to rule our theory out: either way this avenue promises high returns.

2) *Cosmological perturbation theory* -- There is a very well established theory dictating how cosmic density perturbations evolve under gravity, which supports GR to amazing precision based on tiny anisotropies in the cosmic microwave background and the clustering of matter on the grandest scales. The student will extract and characterise the classical perturbation equations (and perhaps convenient gauges) around the torsion VEV, matching against GR where possible. This also targets aspects of the infrared environment, but uses classical methods so does not require prior knowledge of QFT. Apart from offering a neat standalone stability test the perturbation theory will facilitate, in the long run (2023), sophisticated Monte Carlo tests against cosmological survey data: to this end a successful student would also have a stake in these future research way points.

3) *Primordial symmetry breaking* -- We imagine that the Big Bang left our gravity theory in a torsionless conformal phase, bathed in the standard model plasma. So how and when does the torsion VEV form in relation to the condensation of the Higgs field? Could this process have driven inflation, the violent expansion thought to have occurred in the very early Universe? A decaying deviation from the torsion VEV just after inflation can alleviate the Hubble tension: what physics sets this initial condition? The student may wish to merely explore these questions using the background cosmology equations, and there is a viable research-grade project at this level. However depending on interest/experience in electroweak symmetry breaking or effective quantum theories of inflation, we may hope to propose a novel inflationary mechanism.

These topics are not exhaustive and are subject to shift as we study the theory throughout spring 2022.

References
Prerequisite Skills Mathematical physics; PDEs; familiarity with GR and (very) introductory cosmology
Other Skills Used in the Project Data Visualization; QFT and cosmological perturbation theory are a bonus according to chosen topic.
Programming Languages Python; Mathematica (you can get a free license from Maths dept.!), maybe Maple if you prefer it.

 

R&D portfolio optimisation

Project Title R&D portfolio optimisation
Keywords R&D portfolio, biopharma, monte carlo simulation, predictive analytics
Contact Name Nektarios Oraiopoulos
Contact Email no245@cam.ac.uk
Company/Lab/Department Judge Business School
Address Trumpington Street, CB2 1AG
Period of the Project 8-10 weeks
Work Environment Student will work mostly on their own and will have regular meetings with academic supervisor. Working remotely is fine. Would be good to have some meetings face to face.
Project Open to Undergraduates; Master's (Part III) students
Background Information Managing the R&D pipeline in a biopharmaceutical company is one of the most significant challenges in the industry. Decisions have to be made regarding which projects should be advanced to the next stage (and therefore consume significant financial resources of the company) and which projects should be put on hold or terminated. Those decisions are made under significant uncertainty: each project has a likelihood of success, and specifically estimated rates of false positives (the current data look promising, but actually the project is doomed to fail) and false negatives (the current data look weak, but actually the project will work, if given resources).
Brief Description of the Project The student will work closely with Dr Oraiopoulos (https://www.jbs.cam.ac.uk/faculty-research/faculty-a-z/nektarios-oraiopo...) to develop an R&D portfolio optimisation model. The optimisation model will have as inputs estimates regarding the cost of each project, the false positive and negative rates, the commercial potential, etc. and it will calculate (and visualize) the expected reward and risk of different portfolios (allowing the decision-maker to select the most promising one). E.g., the model might suggest that the decision-maker should select only 6 out of the 10 current projects. A key characteristic of the model should be that it compares portfolios of projects rather than single projects. The student might also be given access to large datasets that would allow her/him to estimate those false positive/negative rates using predictive models. The student will also receive feedback from experienced executives from the pharmaceutical industry that have created drugs that transformed the industry.
References
Prerequisite Skills Statistics; Probability/Markov Chains; Simulation; Predictive Modelling; Data Visualization
Other Skills Used in the Project  
Programming Languages Python; MATLAB; R
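
To make the portfolio-level comparison described above concrete, here is a minimal sketch of how expected reward and risk could be computed for every affordable subset of projects. It is an illustration under stated assumptions, not the model the project will build: the project data are invented, the false positive rate is read as P(data look promising | project fails) and the false negative rate as P(data look weak | project works), project outcomes are treated as independent, and risk is measured by the standard deviation of the total payoff.

```python
from itertools import combinations
from math import sqrt

# Hypothetical inputs: (name, cost, payoff if the project works, prior
# probability of success, false positive rate, false negative rate,
# whether the current data look promising).
PROJECTS = [
    ("A", 30.0, 200.0, 0.30, 0.20, 0.10, True),
    ("B", 50.0, 400.0, 0.20, 0.30, 0.20, True),
    ("C", 20.0, 120.0, 0.40, 0.10, 0.30, False),
    ("D", 40.0, 250.0, 0.25, 0.25, 0.15, True),
]


def posterior_success(prior, fp_rate, fn_rate, promising):
    """Bayes' rule: P(project truly works | current signal)."""
    if promising:
        true_pos = (1.0 - fn_rate) * prior
        false_pos = fp_rate * (1.0 - prior)
        return true_pos / (true_pos + false_pos)
    false_neg = fn_rate * prior
    true_neg = (1.0 - fp_rate) * (1.0 - prior)
    return false_neg / (false_neg + true_neg)


def evaluate(portfolio):
    """Return (expected net reward, std. dev. of payoff, total cost)."""
    mean, variance, cost = 0.0, 0.0, 0.0
    for _, c, payoff, prior, fp, fn, promising in portfolio:
        p = posterior_success(prior, fp, fn, promising)
        mean += p * payoff - c
        variance += p * (1.0 - p) * payoff ** 2  # independent Bernoulli payoffs
        cost += c
    return mean, sqrt(variance), cost


def best_portfolio(projects, budget, risk_aversion=0.0):
    """Brute-force search over all subsets; fine for around ten projects."""
    best_score, best_names = float("-inf"), ()
    for size in range(len(projects) + 1):
        for subset in combinations(projects, size):
            mean, risk, cost = evaluate(subset)
            if cost > budget:
                continue
            score = mean - risk_aversion * risk
            if score > best_score:
                best_score = score
                best_names = tuple(p[0] for p in subset)
    return best_names, best_score


if __name__ == "__main__":
    for aversion in (0.0, 0.5):
        print(aversion, best_portfolio(PROJECTS, budget=100.0, risk_aversion=aversion))
```

Scoring whole subsets rather than ranking projects one by one is what lets the budget constraint and the risk term influence which combination is selected; correlated projects would need a covariance term in place of the simple sum of variances.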

Thawing frozen mutations in an ancient transmissible cancer 

Project Title Thawing frozen mutations in an ancient transmissible cancer
Keywords Genomics, Cancer, Evolution, Mutational Signatures
Contact Name Kevin Gori
Contact Email kcg25@cam.ac.uk
Company/Lab/Department Department of Veterinary Medicine
Address Department of Veterinary Medicine, Madingley Road, Cambridge, CB3 0ES
Period of the Project 20 June - 19 August (some flexibility)
Work Environment The student will work as part of the Transmissible Cancer group, based at the Department of Veterinary Medicine. Preferably they will spend at least three days a week in the lab.
Project Open to Undergraduates; Master's (Part III) students
Background Information The project will focus on studying transmissible cancer, which is a rare class of cancer that has developed the ability to infect new hosts. The cancer in question is Canine Transmissible Venereal Tumour (CTVT), which affects dogs worldwide. The project aims to use genomic sequences of CTVT to examine early events that took place in its many-centuries-long evolution, which may help to explain how it arose in the first place.
Brief Description of the Project

This project will be based at the Department of Veterinary Medicine, and will involve genomic analysis of the canine transmissible venereal tumour (CTVT). CTVT is a transmissible cancer, which is a rare class of cancer that has developed the ability to infect new hosts, and behaves rather like a parasitic organism. CTVT is spread among dogs when they come in direct physical contact with tumour tissue infecting another individual, usually during mating. CTVT is by far the oldest clonally reproducing cancer known on Earth.

The project will build on recent work done in our lab to unravel the earliest events that befell the tumour in its progression towards becoming infectious and globally endemic. Using high coverage DNA sequencing information from several CTVT samples, as well as from the uninfected tissue of their hosts, we have previously estimated the evolutionary tree that relates these tumours. This work has identified a previously unseen mutational signature (‘signature A’) that was active during the early evolution of CTVT, but was later switched off. In this project we will take estimates of genomic copy number in our samples, and use these to find genomic regions that have been duplicated in CTVT’s development. From these duplications we can identify ‘frozen time points’: fragments of DNA sequence that were gained during the early evolution of the cancer. The relative ages of these fragments will be estimated by the degree to which they have accumulated new mutations. Combined with this timing information, by examining the fragments for the presence of signature A we will be able to determine whether signature A occurred continuously, or in bursts. Additionally, the earliest fragments will be highly representative of the genotype of the animal in which CTVT arose, illuminating perhaps the characteristics of the earliest domesticated dogs.

References Strakova, Andrea, and Elizabeth P. Murchison. 2015. “The Cancer Which Survived: Insights from the Genome of an 11000 Year-Old Cancer.” Current Opinion in Genetics & Development 30: 49–55. Baez-Ortega, Adrian, Kevin Gori, Andrea Strakova, Janice L. Allen, Karen M. Allum, Leontine Bansse-Issa, Thinlay N. Bhutia, et al. 2019. “Somatic Evolution and Global Expansion of an Ancient Transmissible Cancer Lineage.” Science 365 (6452): eaau9923. Leathlobhair, Máire Ní, Angela R. Perri, Evan K. Irving-Pease, Kelsey E. Witt, Anna Linderholm, James Haile, Ophelie Lebrasseur, et al. 2018. “The Evolutionary History of Dogs in the Americas.” Science 361 (6397): 81–85.
Prerequisite Skills Statistics; Data Visualization
Other Skills Used in the Project Probability/Markov Chains; Simulation
Programming Languages Python; R

Algebraic geometry and problem-based learning 

Project Title Algebraic geometry and problem-based learning
Keywords algebraic geometry, schemes, mathematical exposition, scientific writing, mathematics education
Contact Name Anthony Bordg
Contact Email apdb3@cam.ac.uk
Company/Lab/Department Department of Computer Science and Technology
Address William Gates Building, JJ Thomson Avenue, Cambridge CB3 0FD
Period of the Project 4 weeks
Work Environment The student will work with Anthony Bordg. Remote work possible.
Project Open to Undergraduates; Master's (Part III) students
Background Information   
Brief Description of the Project This project will deal with issues in teaching and learning a very abstract field of mathematics: algebraic geometry. The point of view of a student is desirable and will be valued. Together with Anthony Bordg the student will work on completing an exposition of schemes in algebraic geometry that intends to fill a wide gap in the literature between nontechnical presentations and advanced textbooks. The final goal is the publication of an expository article on the topic in a mathematics journal, e.g. Emergent Scientist, Rocky Mountain Journal of Mathematics ... The student will be given extensive guidance and training in mathematical writing.
References Anthony Bordg, "What is a Scheme in Algebraic Geometry? A Problem-Oriented Approach", https://drive.google.com/file/d/19hsesOZl70hmzYxcV_OgINOgIg2SBD1q/view
Prerequisite Skills Geometry/Topology; Algebra/Number Theory; algebraic geometry
Other Skills Used in the Project  
Programming Languages  

 

Formalising Modular Forms and Dirichlet Series in Isabelle/HOL 

Project Title Formalising Modular Forms and Dirichlet Series in Isabelle/HOL
Keywords number theory, modular forms, interactive theorem proving, type theory, Isabelle/HOL
Contact Name Anthony Bordg
Contact Email apdb3@cam.ac.uk
Company/Lab/Department Department of Computer Science and Technology
Address William Gates Building, JJ Thomson Avenue, Cambridge CB3 0FD
Period of the Project 8 weeks
Work Environment Your supervisor will be Anthony Bordg, but you will interact with the whole ALEXANDRIA team led by Prof. Larry Paulson.
Project Open to Undergraduates; Master's (Part III) students
Background Information  
Brief Description of the Project You will work on a formalisation in Isabelle/HOL of Apostol's textbook "Modular Functions and Dirichlet Series in Number Theory". The main definitions and statements have already been formalised (see the GitHub repo in References), hence you will focus on proving these statements with the help of Isabelle/HOL's efficient automation. This could lead to pioneering and high-impact work in the fast-growing field of the formalisation of mathematics.
References - Modular Functions and Dirichlet Series in Isabelle/HOL, https://github.com/AnthonyBordg/Number_Theory (GitHub repo) - Tom Apostol, Modular Functions and Dirichlet Series in Number Theory, Springer - Isabelle Zulip chat: https://isabelle.zulipchat.com
Prerequisite Skills Mathematical Analysis; Algebra/Number Theory; complex analysis
Other Skills Used in the Project Previous experience with Isabelle/HOL or any other proof assistant (Coq, Lean ...) is desirable but not necessary. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html
Programming Languages Isabelle/HOL

Bayesian Learning for Automation 

Project Title Bayesian Learning for Automation
Keywords Bayesian learning; Neural networks; MCMC; Reinforcement Learning
Contact Name Sumeetpal S. Singh
Contact Email sss40@cam.ac.uk
Company/Lab/Department Engineering
Address Department of Engineering, Trumpington Street, CB21PZ
Period of the Project 10-12 weeks (late June/early July start)
Work Environment Join a team involving 2 other PhD students working on related topics. Based at the Department of Engineering with a mixture of remote and in-person presence. Periodic updates and discussions with sponsor MathWorks.
Project Open to Undergraduates; Master's (Part III) students
Background Information One very promising technique for automation is to gather data from an expert demonstration and then learn the expert's policy using Bayesian inference. The learnt policy is then extrapolated to automate the task in novel settings. The potential applications of this approach are numerous, e.g. automated navigation. The key challenges of this technique of "control by mimicry" are: 1. Learning the expert's policy, a function, and accurately representing uncertainty. 2. To improve knowledge of the expert's policy, more data needs to be gathered. This should be done sparingly, as data gathering can be expensive, and should be guided by an optimality criterion, such as an information-theoretic criterion. Impact on academic area, on user community, industry, and beyond: This is an exciting and novel endeavour that exploits recent advances in Statistics for challenging automation problems.
Brief Description of the Project Deliverables: (i) A generic MATLAB (potentially in collaboration with sponsor MathWorks) and Python implementation of our MCMC sampling algorithm for trace-class neural network priors [2], which can also be used more widely for other applications of Bayesian neural networks. (ii) Proof-of-concept automation implementations on exemplar tasks from OpenAI Gym [3]. (iii) Potential use of our Bayesian neural network sampling algorithm within the CUED curriculum (MEng level). Currently Bayesian neural networks are not taught, nor are they widely experimented with in the MEng projects. (iv) An outreach activity for years 5/6 school children in the field of Reinforcement learning/Automation via a Microbit implementation. ALL CODE WILL BE MADE PUBLIC.
References [1] Simon L Cotter, Gareth O Roberts, Andrew M Stuart, and David White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, pages 424–446, 2013. [2] T. Sell and S.S. Singh. Trace-class Gaussian priors for Bayesian learning of neural networks with MCMC. Under review. Arxiv E-print arXiv:2012.10943. [3] https://gym.openai.com/
Prerequisite Skills Statistics; Probability/Markov Chains
Other Skills Used in the Project Numerical Analysis; PDEs; Mathematical Analysis
Programming Languages Python; MATLAB; Deliverable in MATLAB as required by sponsor.
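
For readers unfamiliar with MCMC methods for functions, the sketch below shows the preconditioned Crank-Nicolson (pCN) proposal described by Cotter, Roberts, Stuart and White, applied to a one-dimensional toy problem. It is an assumption that this resembles the sampler the project will implement; the trace-class neural network prior of Sell and Singh is not reproduced here, and the function names and the toy likelihood are hypothetical. In a neural network setting, `u` would hold the flattened weights and `sample_prior` would draw from the chosen Gaussian prior.

```python
import math
import random


def pcn_sample(neg_log_likelihood, sample_prior, n_steps, beta=0.5, seed=0):
    """Preconditioned Crank-Nicolson MCMC for a zero-mean Gaussian prior.

    neg_log_likelihood(u) -> float is the potential Phi(u);
    sample_prior(rng) -> list of floats is one draw from the prior N(0, C).
    Returns the list of visited states (one per step).
    """
    rng = random.Random(seed)
    u = sample_prior(rng)
    phi_u = neg_log_likelihood(u)
    shrink = math.sqrt(1.0 - beta ** 2)
    samples = []
    for _ in range(n_steps):
        xi = sample_prior(rng)
        # Proposal: v = sqrt(1 - beta^2) * u + beta * xi, with xi ~ N(0, C).
        v = [shrink * ui + beta * xii for ui, xii in zip(u, xi)]
        phi_v = neg_log_likelihood(v)
        # Accept with probability min(1, exp(Phi(u) - Phi(v))). The prior does
        # not appear because the proposal is reversible with respect to it.
        if math.log(1.0 - rng.random()) < phi_u - phi_v:
            u, phi_u = v, phi_v
        samples.append(list(u))
    return samples


def toy_potential(u):
    """Toy data: one observation y = 1 of u with unit Gaussian noise."""
    return 0.5 * (1.0 - u[0]) ** 2


def standard_normal_prior(rng):
    """Toy prior: a scalar N(0, 1)."""
    return [rng.gauss(0.0, 1.0)]


if __name__ == "__main__":
    chain = pcn_sample(toy_potential, standard_normal_prior, 20000, beta=0.5, seed=1)
    kept = [s[0] for s in chain[2000:]]  # discard burn-in
    # The exact posterior for this toy problem is N(0.5, 0.5).
    print(sum(kept) / len(kept))
```

The acceptance rule depends only on the likelihood, which is why this family of samplers remains well behaved as the dimension of `u` grows, the property that makes it attractive for priors over functions.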

AI for Coronary Artery Imaging

Project Title AI for Coronary Artery Imaging
Keywords Autoencoders, deep learning, python, imaging, computer vision
Contact Name Mike Roberts
Contact Email mr808@cam.ac.uk
Company/Lab/Department DAMTP / Cardiology
Address Department of Medicine / DAMTP
Period of the Project 8 weeks +
Work Environment In DAMTP or the Department of Medicine
Project Open to Undergraduates; Master's (Part III) students
Background Information Coronary arteries can be imaged using Optical Coherence Tomography, which shows features of the artery from the inside and so allows disease to be identified. These images are extremely high dimensional, making many deep learning methods intractable.
Brief Description of the Project We will apply deep learning methods for image compression, encoding and reconstruction to encode high dimensional images to low dimensional representations. This then allows for downstream identification of diseased tissue, quantification and prediction of outcomes.
References  
Prerequisite Skills Image Processing
Other Skills Used in the Project  
Programming Languages Python