This is a list of the Academic CMP project proposals from summer 2022.
- Dynamical modelling of mutation as an immune defence
Keywords: Disease dynamics, virus evolution, computational modelling
- SARS-CoV-2 pandemic-scale phylogenetics
Keywords: Phylogenetics, mathematical modeling, molecular evolution, epidemiological modeling
- Discovering regions of functional importance in RNA viruses
Keywords: Computational modelling, virus evolution, bioinformatics, mathematical biology
- Speech enhancement for hearing devices: learned sound representations versus deterministic transforms
Keywords: deep learning, speech enhancement, frequency domain, auditory filterbanks, speech intelligibility
- Decoding the Neural Signature of Speech Perception
Keywords: Cognitive neuroscience, brain-computer interface, speech perception, audio processing
- Formalising Thom encoding in Isabelle/HOL
Keywords: Mechanising mathematics, proof assistant, real algebraic geometry, computer algebra
- Representing plant hydraulics and plant water stress in a dynamic global vegetation model
Keywords: Water Stress Plants ABA Stomata
- Climate Repair: Ice Thickening
Keywords: Arctic, ice, geoengineering, partial differential equations, numerical modelling
- Integrating microvascular biophysics with graph neural networks
Keywords: Machine learning; biophysics; blood vessels; graph neural networks; modelling
- Formalisation of material in number theory/additive combinatorics using Isabelle/HOL
Keywords: number theory, additive combinatorics, proof assistants, interactive theorem proving, Isabelle/HOL
- Deep learning for microscopy image reconstruction
Keywords: deconvolution, deep learning, lightsheet microscopy, image reconstruction
- Bayesian machine learning and theory in cosmology and particle physics
Keywords: Bayesian Inference; Machine Learning; Cosmology
- Novel Flexible Polyhedra
Keywords: geometry, computer programming
- Quantum stability of a novel theory of gravity
Keywords: modified gravity, torsion, computer algebra, effective field theory, cosmology
- R&D portfolio optimisation
Keywords: R&D portfolio, biopharma, monte carlo simulation, predictive analytics
- Thawing frozen mutations in an ancient transmissible cancer
Keywords: Genomics, Cancer, Evolution, Mutational Signatures
- Algebraic geometry and problem-based learning
Keywords: algebraic geometry, schemes, mathematical exposition, scientific writing, mathematics education
- Formalising Modular Forms and Dirichlet Series in Isabelle/HOL
Keywords: number theory, modular forms, interactive theorem proving, type theory, Isabelle/HOL
- Bayesian Learning for Automation
Keywords: Bayesian learning; Neural networks; MCMC; Reinforcement Learning
- AI for Coronary Artery Imaging
Keywords: Autoencoders, deep learning, python, imaging, computer vision
Dynamical modelling of mutation as an immune defence
Project Title | Dynamical modelling of mutation as an immune defence |
Keywords | Disease dynamics, virus evolution, computational modelling |
Contact Name | Jordan Skittrall |
Contact Email | jps55@cam.ac.uk |
Company/Lab/Department | Department of Pathology |
Address | Division of Virology, Addenbrooke's Hospital, Cambridge, CB2 0QQ |
Period of the Project | 8 weeks |
Work Environment | The intention is that you will work in the Division of Virology, Department of Pathology, which is on the Addenbrooke's site in Cambridge. Should it be necessary to work remotely this will be possible, but the aim would be to work in person to maximise opportunities for discussion of ideas. Hours are flexible but you should expect to work standard working weeks in total. Jordan Skittrall (Pathology) will be the day-to-day supervisor and contact. There will be the opportunity to get to know other members of the division and to join any meetings taking place during the placement period. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | One form of immune response to viral infection involves deliberately mutating the virus to make it stop working. This is seen, for example, in HIV infection, where around an order of magnitude of the genetic diversity seen in viruses can be explained by this immune response, and where the virus makes a protein to defend against the response. It is also the mechanism by which some antiviral drugs work - including one of the drugs recently licensed for use in treating COVID-19. But it is a potentially risky strategy - mutations also allow viruses to escape immune responses and develop drug resistance. |
Brief Description of the Project |
The aim of this project is to develop a model to help us understand better how key parameters affect the balance between mutation helping the host, and mutation helping the virus. In particular, the aim is to understand to what extent both outcomes may occur in the same population if there is variability in the parameters within the population, and to estimate how often such outcomes occur. In the project, you will develop a simplified model capturing the key aspects of the mutation process, and implement and refine that model in one (or possibly more than one) infection scenario. To understand the parameter space developed, you will need to implement basic code to describe possible outcomes. This will be sufficiently straightforward that you can learn as you go if you have not done this before, but you should have basic familiarity with at least one programming language. Funding for this project is not guaranteed, and if interested in this project you are encouraged to make contact as soon as possible, as many funding opportunities require a named student in the application and earlier contact will allow us to apply before deadlines for more funders. You will not be required to respond to any project offer before the common deadline for the placements. |
References |
A background to the way the mutational arms race proceeds in HIV as an example (although we may work on a problem with a simpler setup) is given in the following references:
|
Prerequisite Skills | Statistics;Dynamical Systems |
Other Skills Used in the Project | Probability/Markov Chains;Data Visualization;Mathematical Biology |
Programming Languages | Python;MATLAB;Mathematica |
SARS-CoV-2 pandemic-scale phylogenetics
Project Title | SARS-CoV-2 pandemic-scale phylogenetics |
Keywords | Phylogenetics, mathematical modeling, molecular evolution, epidemiological modeling |
Contact Name | Nicola De Maio |
Contact Email | demaio@ebi.ac.uk |
Company/Lab/Department | EMBL-EBI |
Address | The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom |
Period of the Project | 8 weeks or preferably more |
Work Environment | The student will be supervised day by day by a senior postdoc in the group, interact regularly with 2 other PhD students in the group, attend weekly group meetings, and be supervised weekly by the group leader. |
Project Open to | Undergraduates;Master's (Part III) students |
Background Information | The COVID-19 pandemic has been accompanied by an extremely intense and widespread sequencing effort, resulting in millions of SARS-CoV-2 genomes available to researchers to help investigate the spread and evolution of the virus. However, existing data analysis and mathematical modeling methods struggle to deal with this volume of data. |
Brief Description of the Project | The students will contribute to methods developed in our lab to efficiently and accurately investigate SARS-CoV-2 genome data, and, more generally, will work on phylogenetic methods that infer evolutionary history from DNA sequence data. The project will involve the development of efficient algorithms and mathematical models and their implementation in Python, C++ or Java. The project can involve the development of Markov models for phylogenetics or the use of neural networks for molecular evolution. The ideal outcome would be the development of novel mathematical models to accurately describe complex scenarios of DNA evolution in a computationally efficient manner. |
References | |
Prerequisite Skills | Probability/Markov Chains;Programming, preferably in Python, but Java and C++ would also be extremely useful. |
Other Skills Used in the Project | Statistics;Simulation |
Programming Languages | Python;C++;Java |
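As an illustration of the kind of Markov model for phylogenetics mentioned in the description above, the sketch below uses the Jukes-Cantor (JC69) substitution model, in which every nucleotide changes to every other at the same rate. The choice of JC69 and the function names are assumptions made for illustration only; they are not taken from the proposal and are not the lab's own methods.

```python
import math

def jc69_transition_matrix(branch_length):
    """4x4 matrix P where P[i][j] is the probability that nucleotide i
    has become nucleotide j after a branch of the given length
    (measured in expected substitutions per site) under JC69."""
    decay = math.exp(-4.0 * branch_length / 3.0)
    same = 0.25 + 0.75 * decay   # probability of observing the same base
    diff = 0.25 - 0.25 * decay   # probability of each specific other base
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def jc69_distance(seq_a, seq_b):
    """JC69 distance estimate between two aligned sequences,
    computed from the proportion of mismatching sites."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    p = mismatches / len(seq_a)
    if p >= 0.75:
        return math.inf  # saturation: the estimate is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Small demonstration with made-up sequences.
matrix = jc69_transition_matrix(0.1)
distance = jc69_distance("ACGTACGTAC", "ACGTACGTAA")
```

Each row of the matrix sums to one, and a branch of length zero gives the identity matrix. Pandemic-scale methods need far more efficient representations than this, which is part of what the project addresses.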
Discovering regions of functional importance in RNA viruses (13 February Deadline)
Project Title | Discovering regions of functional importance in RNA viruses |
Keywords | Computational modelling, virus evolution, bioinformatics, mathematical biology |
Contact Name | Jordan Skittrall |
Contact Email | jps55@cam.ac.uk |
Company/Lab/Department | Department of Pathology |
Address | Division of Virology, Addenbrooke's Hospital, Cambridge, CB2 0QQ |
Period of the Project | 8 weeks |
Work Environment | The intention is that you will work in the Division of Virology, Department of Pathology, which is on the Addenbrooke's site in Cambridge. Should it be necessary to work remotely this will be possible, but the aim would be to work in person to maximise opportunities for discussion of ideas. Hours are flexible but you should expect to work standard working weeks in total. |
Project Open to | Undergraduate students only |
Background Information | The genetic sequences of all organisms we see can be viewed as the results of a natural experiment in what is capable of surviving. By examining the sequences of these organisms we can analyse the results of that natural experiment. Viruses that replicate using RNA (a large proportion of the viruses we know) give some of the smallest examples of such sequences, making their analysis computationally tractable. By detecting regions of such viruses that have to be conserved for the virus to survive, we get pointers to elements of the viral lifecycle, and to possible drug targets. |
Brief Description of the Project |
This project would be especially suitable for somebody who wanted to explore moving into bioinformatics, or applying mathematical knowledge to microbiology. It will be advertised to both mathematicians and biologists, and the focus for a mathematician will be on learning the skills in handling biological data and in virology required to undertake an analysis and start interpreting the results. In this project you will apply to a set of virus genomes one of our mathematical techniques for searching for regions of interest. You will need to:
There are a few options for viruses it would be possible to work with on this project, which can be discussed prior to application. The project is likely to focus on a virus capable of causing human disease for which drug treatment is sought. The background work underpinning this project stretches all the way from open problems in probability theory, through the mathematics of signal processing and bioinformatics, to wet lab molecular biology and clinical applications. It is a requirement of the funding stream we propose to seek for this project that the project be strongly microbiological in nature and the project background be in microbiology in such a way that students undertaking it will be exposed to core microbiology thinking. Funding for this project will be sought via application to an external body (and so cannot be guaranteed). The deadline for the external body's funding applications necessitates an earlier deadline of *13th February 2022* for expressions of interest, although earlier expressions of interest will make funding application more straightforward. |
References |
These references are listed in priority order for background reading (the first two are key), but if you have time, the progression of ideas is slightly easier to follow if they are read chronologically. There is no need to have read the references prior to sending an enquiry about the project: they are included in case you would like more information. The references describe the mathematical underpinning of the techniques used, and demonstrate some previous applications of those techniques. [1] “A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data”, Julia R. Gog, Andrew M.L. Lever, Jordan P. Skittrall. PLoS ONE, 2018, 13(4):e0195763. (https://doi.org/10.1371/journal.pone.0195763) |
Prerequisite Skills | |
Other Skills Used in the Project | Statistics;Database Queries;Data Visualization |
Programming Languages | Mathematica, small amounts of bash (it's fine to start the project with no knowledge of these) |
Speech enhancement for hearing devices: learned sound representations versus deterministic transforms
Project Title | Speech enhancement for hearing devices: learned sound representations versus deterministic transforms |
Keywords | deep learning, speech enhancement, frequency domain, auditory filterbanks, speech intelligibility |
Contact Name | Clément Gaultier, Tobias Goehring |
Contact Email | Clement.Gaultier@mrc-cbu.cam.ac.uk |
Company/Lab/Department | Deep Hearing Lab, MRC Cognition and Brain Sciences Unit, University of Cambridge |
Address | Clement.Gaultier@mrc-cbu.cam.ac.uk |
Period of the Project | Any 8-week period from late June to early September |
Work Environment | The student will be part of the Deep Hearing Lab (1) based at the MRC Cognition and Brain Sciences Unit (2) and work with primary supervisor Dr. Clément Gaultier as well as secondary supervisor Dr. Tobias Goehring. The student will benefit from joining the Cambridge Hearing Group (3), a world-leading and vibrant research network in Cambridge. The candidate will have the opportunity to use applied mathematics for signal processing and learn basics of time-frequency sound analysis/synthesis. Remote working can be arranged if conditions make it difficult or not suitable to come to the office daily (depending on the evolving COVID situation). (1): Deep Hearing Lab: https://www.deephearinglab.com (2): MRC CBSU: https://www.mrc-cbu.cam.ac.uk (3): Cambridge Hearing Group: https://www.hearing-research.group.cam.ac.uk |
Project Open to | Undergraduates;Master's (Part III) students |
Background Information | This project is part of a larger multidisciplinary project studying new speech enhancement strategies with the aim to improve sound perception for people using hearing devices (hearing aids, cochlear implants) in challenging listening situations with noise and reverberation. |
Brief Description of the Project |
Deep learning has brought substantial improvements to speech recognition, enhancement and separation systems by using a masking network to estimate a time-frequency mask in a deterministic transform domain (e.g. the Fourier domain) of the time-domain sound signal. The masked representation is then transformed back to a time-domain signal (e.g. by an inverse Fourier transform) to yield the enhanced sound. In recent years, new artificial neural network architectures (Recurrent Neural Networks, Attention mechanisms) along with the introduction of end-to-end learning systems have provided a significant boost in performance [1]. These new techniques work in an Encoder-Mask-Decoder fashion where the mask is no longer estimated from a fixed transform domain but from learned representations available at the encoder stage. At the decoder stage, the masked representations are then transformed back to a time-domain signal ("decoded") to obtain the enhanced sound signals. This project will explore the impact of such learned representations compared with deterministic transforms (auditory-inspired transforms, discrete Fourier transforms, etc.) in the context of noise and reverberation compensation for hearing-impaired people who use hearing aids or cochlear implants. The student will investigate the following research questions (but not limited to): The outcomes of the study will be systematically evaluated using objective measures and pilot listening tests and results will contribute to an ongoing research project. The student will use common high-level (Python) deep learning frameworks for speech separation/enhancement including filterbank design tools for comparing the encoder-decoder models and deterministic transforms [2]. |
References | [1]: Luo, Yi, and Nima Mesgarani. 'TasNet: time-domain audio separation network for real-time, single-channel speech separation.' 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018. [2]: Pariente, Manuel, et al. 'Asteroid: the pytorch-based audio source separation toolkit for researchers.' arXiv preprint arXiv:2005.04132 (2020). |
Prerequisite Skills | Statistics;Algebra/Number Theory;The candidate should have basic knowledge of a programming language, statistics and linear algebra, and be keen to learn or use Python and high-level signal processing tools. An interest in acoustics or speech processing is a plus. |
Other Skills Used in the Project | |
Programming Languages | Python;MATLAB;R;C++;No Preference |
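The transform, mask, inverse-transform pipeline described in this project can be sketched on a single short frame. This is a minimal illustration under stated assumptions: a naive discrete Fourier transform stands in for the deterministic transform, the mask is chosen by hand ("oracle" knowledge of where the target lies) rather than estimated by a network, and the signals and function names are invented for the example. Practical systems would use overlapping windowed frames and a learned mask.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of the time-domain signal."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def apply_mask(frame, mask):
    """Transform the frame, weight each frequency bin by the mask,
    and transform back to the time domain."""
    spectrum = dft(frame)
    masked = [m * s for m, s in zip(mask, spectrum)]
    return idft(masked)

# Hypothetical frame: a target tone in bin 2 plus an interfering tone in bin 9.
N = 32
target = [math.cos(2 * math.pi * 2 * t / N) for t in range(N)]
interferer = [0.5 * math.cos(2 * math.pi * 9 * t / N) for t in range(N)]
noisy = [s + v for s, v in zip(target, interferer)]

# Hand-made binary mask keeping only the target's bins (2 and its mirror N - 2).
mask = [1.0 if k in (2, N - 2) else 0.0 for k in range(N)]
enhanced = apply_mask(noisy, mask)
max_error = max(abs(e - s) for e, s in zip(enhanced, target))
```

In the encoder-mask-decoder systems the project studies, the fixed `dft` and `idft` steps would be replaced by learned encoder and decoder layers.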
Decoding the Neural Signature of Speech Perception
Project Title | Decoding the Neural Signature of Speech Perception |
Keywords | Cognitive neuroscience, brain-computer interface, speech perception, audio processing |
Contact Name | Tobias Goehring and/or Alexis Deighton MacIntyre |
Contact Email | alexisdeighton.macintyre@mrc-cbu.cam.ac.uk |
Company/Lab/Department | Deep Hearing Lab, MRC Cognition and Brain Sciences Unit, University of Cambridge |
Address | alexisdeighton.macintyre@mrc-cbu.cam.ac.uk |
Period of the Project | 8 weeks |
Work Environment | The student will work together with one primary supervisor (Alexis Deighton MacIntyre/Post Doc) but will also benefit from close cooperation with other members of a friendly and vibrant lab, as well as connections within the broader Cambridge Hearing Group community, a multidisciplinary affiliation spanning brain sciences, medicine, and engineering, and the MRC-CBU, a world-leading research centre in basic and translational cognitive neuroscience. For more information, see https://www.hearing-research.group.cam.ac.uk/ and https://www.mrc-cbu.cam.ac.uk/ |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | Electroencephalography (EEG) is a method to record brain activity in the form of electrical signals at the scalp. Recent analytical advances allow us to infer or reconstruct aspects of subjective, auditory experiences, such as speech perception, using a listener's neural data alone. This development may hold promise for applications in brain-computer interfaces, such as smart hearing aids or cochlear implants that aim to optimise auditory perception for people with hearing loss. |
Brief Description of the Project | Various techniques exist to correlate external stimuli, like acoustic recordings of speech, with EEG data. Some popular approaches include mutual information (MI) analysis and the fitting of temporal response functions (TRF) using regularised linear regression. One problem is that the resulting information and/or correlation values, though statistically robust, tend to be very small in magnitude, suggesting room for improvement. It may be that the specific choice of ground truth (e.g., engineered acoustic features) is inappropriate, or that the strength of the correspondence between stimulus and neural response varies over time, in which case a temporally dynamic approach may be preferable. Finally, data-driven models derived with machine learning (ML) may complement and/or surpass the techniques described above, which assume a simple stimulus-to-brain mapping. Using EEG data from human listeners, the student's role will entail the systematic comparison of different input acoustic features, whilst taking the form of analysis (e.g., MI, TRF) as well as temporal factors (e.g., discrete versus continuous sampling, effects of window length) into account. The objective is to determine the impact of choice of feature on the overall measure of correspondence between acoustic stimuli and brain signals. The results of this project will contribute to a bigger, ongoing project and inform optimal decoding approaches with applications in hearing research. Depending on the student's interest, there is also some opportunity to explore ML-based, non-linear methods and compare them with more established, but constrained, linear approaches. |
References | Crosse, M. J., Di Liberto, G. M., Bednar, A., & Lalor, E. C. (2016). The multivariate temporal response function (mTRF) toolbox: a MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in human neuroscience, 10, 604. Ding, N., & Simon, J. Z. (2014). Cortical entrainment to continuous speech: functional roles and interpretations. Frontiers in human neuroscience, 8, 311. |
Prerequisite Skills | Statistics;Mathematical Analysis;Predictive Modelling |
Other Skills Used in the Project | Machine learning, biomedical data analysis |
Programming Languages | Python;MATLAB |
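The fitting of a temporal response function by regularised linear regression, mentioned in the description above, can be sketched as ridge regression on time-lagged copies of the stimulus. This is a toy sketch on synthetic data, not the mTRF toolbox and not the lab's pipeline; the single-channel setup, the function names, the kernel values and the ridge parameter are all assumptions for illustration.

```python
import random

def lagged_design(stimulus, n_lags):
    """Row t holds stimulus[t], stimulus[t-1], ..., zero-padded at the start."""
    return [[stimulus[t - lag] if t - lag >= 0 else 0.0 for lag in range(n_lags)]
            for t in range(len(stimulus))]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def fit_trf(stimulus, response, n_lags, ridge):
    """Ridge estimate w = (X^T X + ridge * I)^(-1) X^T y over the lagged stimulus."""
    x = lagged_design(stimulus, n_lags)
    xtx = [[sum(row[i] * row[j] for row in x) + (ridge if i == j else 0.0)
            for j in range(n_lags)] for i in range(n_lags)]
    xty = [sum(row[i] * y for row, y in zip(x, response)) for i in range(n_lags)]
    return solve_linear(xtx, xty)

# Synthetic check: a noise-free "response" generated from a known kernel.
rng = random.Random(0)
stimulus = [rng.gauss(0.0, 1.0) for _ in range(200)]
true_kernel = [0.0, 0.5, 1.0, 0.3]
response = [sum(k * v for k, v in zip(true_kernel, row))
            for row in lagged_design(stimulus, len(true_kernel))]
estimated_kernel = fit_trf(stimulus, response, n_lags=4, ridge=1e-9)
```

With real EEG the response is noisy and multichannel, so the ridge parameter matters much more and is usually chosen by cross-validation.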
Formalising Thom encoding in Isabelle/HOL
Project Title | Formalising Thom encoding in Isabelle/HOL |
Keywords | Mechanising mathematics, proof assistant, real algebraic geometry, computer algebra |
Contact Name | Wenda Li |
Contact Email | wl302@cam.ac.uk |
Company/Lab/Department | Department of Computer Science and Technology, University of Cambridge |
Address | William Gates Building, JJ Thomson Avenue, Cambridge, CB3 0FD |
Period of the Project | 8 weeks |
Work Environment | The student will mainly collaborate with me, but he/she is also welcome to chat with others in the ALEXANDRIA group (https://www.cl.cam.ac.uk/~lp15/Grants/Alexandria/). There is a possibility that the project will be remote depending on the situation of the pandemic. |
Project Open to | Undergraduates;Master's (Part III) students |
Background Information | Modern proof assistants allow users to interact with computers to mechanise mathematical theorems and their proofs. Here, all derivation steps will be mechanically checked, so that ambiguities and errors in normal hand-written proofs can be eliminated. Recent progress in proof assistants includes Grothendieck's Schemes in the Isabelle proof assistant [1] and some recent results from Peter Scholze being mechanised in the Lean proof assistant [2]. This project will involve doing mechanised proofs within the Isabelle proof assistant. |
Brief Description of the Project | Real algebraic numbers are usually encoded as a univariate integer polynomial P and an interval (with rational end points) such that there is exactly one root of P within this interval. However, this encoding is not sufficient in the field of rationals extended with real algebraic numbers and infinitesimals, since a polynomial with infinitesimal coefficients can have roots that cannot be isolated with a rational interval. To address this problem, we may need Thom encoding to distinguish polynomial roots in a non-archimedean field, as has been implemented in the Z3 SMT solver [4]. In the project, the goal is to formalise the fundamental property of Thom encoding in Isabelle/HOL (Proposition 2.28 in Bochnak, Coste and Roy [3]). |
References | [1] Bordg, Anthony, Lawrence Paulson, and Wenda Li. "Simple Type Theory is not too Simple: Grothendieck's Schemes without Dependent Types." arXiv preprint arXiv:2104.09366 (2021). [2] Castelvecchi, Davide. "Mathematicians welcome computer-assisted proof in 'grand unification' theory." Nature (2021). [3] Bochnak, J., Coste, M. and Roy, M. F. (2013). Real algebraic geometry (Vol. 36). Springer. [4] Grant Passmore and Leonardo de Moura (2013). Computation in Real Closed Infinitesimal and Transcendental Extensions of the Rationals. Proceedings of the 24th International Conference on Automated Deduction (CADE-24). |
Prerequisite Skills | Mathematical Analysis;Prior knowledge of proof assistants (e.g., Coq, Lean, Isabelle) is preferred but not required |
Other Skills Used in the Project | |
Programming Languages | Isabelle |
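The idea behind Thom encoding, as it is commonly described, is that a real root of a polynomial P can be identified by the signs that the derivatives P', P'', ... take at that root, with no isolating interval needed. The sketch below illustrates this over the rationals with exact arithmetic. It is an informal Python illustration and not Isabelle/HOL; the example polynomial and all function names are assumptions chosen for the example.

```python
from fractions import Fraction

def derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...] (ci multiplies x**i)."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

def thom_encoding(coeffs, root):
    """Signs of P', P'', ... at a root of P, as a tuple."""
    if evaluate(coeffs, root) != 0:
        raise ValueError("the given point is not a root of the polynomial")
    signs = []
    current = derivative(coeffs)
    while current:
        signs.append(sign(evaluate(current, root)))
        current = derivative(current)
    return tuple(signs)

# P(x) = x^3 - x^2 - 2x = (x + 1) x (x - 2), with rational roots -1, 0 and 2.
P = [0, -2, -1, 1]
encodings = {root: thom_encoding(P, Fraction(root)) for root in (-1, 0, 2)}
```

The three roots receive three different sign vectors, which is the distinguishing property the project aims to formalise. Rational roots are used here only so that the signs can be computed exactly.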
Representing plant hydraulics and plant water stress in a dynamic global vegetation model
Project Title | Representing plant hydraulics and plant water stress in a dynamic global vegetation model |
Keywords | Water Stress Plants ABA Stomata |
Contact Name | Prof. Andrew D. Friend |
Contact Email | adf10@cam.ac.uk |
Company/Lab/Department | Department of Geography |
Address | Department of Geography, University of Cambridge, Downing Place, Cambridge CB2 3EN |
Period of the Project | 8 weeks summer 2022 |
Work Environment | The project is set up to complement current work in our research group and give the student a realistic research experience, so there is some room to customize the project based on the student's interests and skills. The project will be performed using MATLAB (or similar programming language) for the data analysis and FORTRAN for the implementation of sub models in HYBRID. Model runs will be carried out on the Cambridge CSD3 cluster. The student will work closely with a PhD student. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | In a changing climate the global hydrological cycle will alter significantly (Abbott et al., 2019) and drought conditions will become more frequent and intermittent (Greve et al., 2019; Pokhrel et al., 2021; Samaniego et al., 2018). Under these conditions it is critical for plants to manage their water economy efficiently. Modeling plant water stress is a key component of dynamic global vegetation models (Bonan et al., 2014; Eller et al., 2020; Kennedy et al., 2019) as it is necessary to correctly predict plant productivity and mortality during periods of drought (Stocker et al., 2019; Trugman et al., 2018). |
Brief Description of the Project | In this project we would like you to work on implementing alternative stomatal conductance models and water stress functions into the HYBRID vegetation model, as well as further representing plant signaling pathways that lead to stomatal closure, regulating water loss. One signaling pathway that could be included is abscisic acid (ABA) concentration, which is a well-known driver of stomatal closure. This would include creating a sub-model for ABA synthesis, transport, and sequestration, as well as the changing sensitivity of stomatal conductance to ABA concentration (could be based on the work of Dewar, 2002). Furthermore, the resulting plant water fluxes during drought can be compared with, and analyzed with respect to, microwave observations of vegetation water content. |
References |
|
Prerequisite Skills | Predictive Modelling |
Other Skills Used in the Project | |
Programming Languages | MATLAB; Fortran |
Climate Repair: Ice Thickening
Project Title | Climate Repair: Ice Thickening |
Keywords | Arctic, ice, geoengineering, partial differential equations, numerical modelling |
Contact Name | Katie Parker (Project Manager), Professor Hugh Hunt (Supervisor) |
Contact Email | kvp24@dow.cam.ac.uk |
Company/Lab/Department | Centre for Climate Repair / Engineering |
Address | Centre for Climate Repair, Downing College |
Period of the Project | 8-10 weeks at a date to be agreed between June and Sept |
Work Environment | The project can be undertaken remotely or in person, subject to discussion at interview. The team typically work office hours (9-5) but are flexible. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | The Arctic is melting fast. The Centre for Climate Repair (CCRC) is looking at technologies that might slow down or reverse this melting. One idea is to spray seawater onto existing ice during the cold winter thereby thickening it so that it will last through the Arctic summer. A fourth-year engineering student has developed ice-thickening experiments and has made interesting measurements with water flowing in a channel inside a freezer at -18°C, but there is a need for a mathematical model to enable the measurements to be properly interpreted. |
Brief Description of the Project | This project is ideally suited to an applied mathematician or engineer who is comfortable with partial differential equations and numerical methods. The balance between heat transfer, sensible heat and latent heat in flowing water is formulated in terms of distance and time. As if this isn't complicated enough, salt water as it freezes develops a salt-concentration gradient which needs to be included. If we have a good model, then we can use it to design methodologies for creating ice in the Arctic. |
References | Peter Wadhams 'A Farewell to Ice'; Centre for Climate Repair website, especially working papers - https://www.climaterepair.cam.ac.uk/working-papers; a couple of papers from last year's internships available on request from kvp24@dow.cam.ac.uk |
Prerequisite Skills | Numerical Analysis; Mathematical Analysis; Predictive Modelling |
Other Skills Used in the Project | |
Programming Languages | Python; MATLAB |
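As a starting point for the modelling described in this project, the balance between heat conduction and latent heat can be reduced to the classical quasi-steady Stefan growth law for an ice layer of thickness h: rho * L * dh/dt = k * dT / h. The sketch below compares its closed-form solution with a simple explicit Euler integration. It is a deliberately simplified sketch and not the model the project asks for: it ignores flow, sensible heat and salt gradients, the material constants are approximate textbook values, and a 0°C freezing point (fresh water) against the -18°C freezer temperature is an assumption.

```python
import math

# Approximate textbook values (assumptions, not taken from the proposal).
K_ICE = 2.2           # thermal conductivity of ice, W/(m K)
RHO_ICE = 917.0       # density of ice, kg/m^3
LATENT_HEAT = 3.34e5  # latent heat of fusion, J/kg

def analytic_thickness(h0, temp_drop, time_s):
    """Closed-form solution h(t) = sqrt(h0^2 + 2 k dT t / (rho L))."""
    return math.sqrt(h0 ** 2 + 2.0 * K_ICE * temp_drop * time_s / (RHO_ICE * LATENT_HEAT))

def euler_thickness(h0, temp_drop, time_s, dt=60.0):
    """Explicit Euler integration of dh/dt = k dT / (rho L h)."""
    h = h0
    for _ in range(int(round(time_s / dt))):
        h += dt * K_ICE * temp_drop / (RHO_ICE * LATENT_HEAT * h)
    return h

# Hypothetical scenario: 1 cm of initial ice, surface held 18 K below
# the freezing point, growth followed over 24 hours.
initial_thickness = 0.01
temperature_drop = 18.0
duration = 24 * 3600.0
h_exact = analytic_thickness(initial_thickness, temperature_drop, duration)
h_numeric = euler_thickness(initial_thickness, temperature_drop, duration)
```

Agreement between the two values is a basic check on the numerical scheme before terms for flow and salinity, which have no simple closed form, are added.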
Integrating microvascular biophysics with graph neural networks
Project Title | Integrating microvascular biophysics with graph neural networks |
Keywords | Machine learning; biophysics; blood vessels; graph neural networks; modelling |
Contact Name | Dr Paul Sweeney |
Contact Email | Paul.sweeney@cruk.cam.ac.uk |
Company/Lab/Department | Bohndiek Lab, Cancer Research UK Cambridge Institute, University of Cambridge |
Address | Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE |
Period of the Project | 8 weeks (flexible) |
Work Environment | Lab based - hours typically 10am - 4pm. Lab consists of several post-docs and PhD students. Can flexibly work from home at times. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | Many real-world datasets can be defined in the form of a graph, for example, social networks, protein interactions and road networks. An active area of interest in machine learning is graph neural networks (GNNs), which operate on graph data. Subsequent application of GNNs has led to progress in fake news detection, antibacterial discovery and traffic detection. Biomedical imaging can extract information on the structure of large microvascular networks (10^6 vessels) which can be used as input into mathematical models to investigate biological transport phenomena. These networks of blood vessels can also be represented as graphs and so are amenable to GNNs. Due to the ever increasing size of these datasets, as a result of advances in imaging, it would be interesting to see if GNNs can rapidly generate accurate predictions relating to the inherent structure of these networks, in addition to predicting more complex biophysics. |
Brief Description of the Project | The aim of the project will be for the student to build their own graph neural network in Python (using standard APIs e.g., Tensorflow / Keras) to form useful predictions relating to properties of vascular networks. Students will generate their own synthetic vascular graphs using existing software, to build a library of synthetic data, as well as utilise existing blood vessel structural datasets obtained via biomedical imaging. These data could be used to model blood flow using existing packages (C++, no experience needed) as well as act as input to their GNN. Initially the student will develop their GNN for undirected graphs to predict basic properties of the graph (e.g., path of least resistance). Next the GNN will be developed to incorporate directed graphs to be trained against their blood flow simulations, enabling their GNN to predict biophysical properties for any arbitrary graph used as input. A successful project is deemed as one where the student gains competence and confidence in coding, model design and application. The project is open-ended, evolving as new ideas arise and dependent on student progress. |
References |
|
Prerequisite Skills | Simulation; Predictive Modelling; A basic understanding of machine learning & coding experience |
Other Skills Used in the Project | Statistics; Mathematical physics; PDEs |
Programming Languages | Python; C++ |
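As an illustration of the first milestone in the description above (predicting the path of least resistance on an undirected graph), the following minimal Python sketch computes such a path with Dijkstra's algorithm; a result of this kind could serve as a ground-truth label when training a GNN. The toy graph, the node names and the per-vessel resistance values are hypothetical, and treating the summed edge resistance as the quantity to minimise is an assumption rather than the project's stated definition.

```python
import heapq


def least_resistance_path(edges, source, target):
    """Return (total_resistance, path) between two nodes of an undirected graph.

    edges: iterable of (node_a, node_b, resistance) with resistance >= 0.
    """
    # Build an adjacency list; each undirected edge is stored in both directions.
    adjacency = {}
    for a, b, r in edges:
        adjacency.setdefault(a, []).append((b, r))
        adjacency.setdefault(b, []).append((a, r))

    best = {source: 0.0}      # lowest resistance found so far to each node
    previous = {}             # back-pointers used to rebuild the path
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, r in adjacency.get(node, []):
            new_cost = cost + r
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))

    if target not in best:
        return float("inf"), []  # target is unreachable from source
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical toy network: nodes are vessel junctions, weights are resistances.
toy_edges = [("A", "B", 1.0), ("B", "D", 5.0), ("A", "C", 2.0), ("C", "D", 1.5)]
total, path = least_resistance_path(toy_edges, "A", "D")
print(total, path)
```

In a real vascular network the edge weights might instead be derived from vessel length and radius, which is a modelling choice left open here.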
Formalisation of material in number theory/additive combinatorics using Isabelle/HOL
Project Title | Formalisation of material in number theory/additive combinatorics using Isabelle/HOL |
Keywords | number theory, additive combinatorics, proof assistants, interactive theorem proving, Isabelle/HOL |
Contact Name | Dr. Angeliki Koutsoukou-Argyraki |
Contact Email | ak2110@cam.ac.uk |
Company/Lab/Department | University of Cambridge, Department of Computer Science and Technology (Computer Laboratory) |
Address | 15 JJ Thomson Avenue, Cambridge CB3 0FD |
Period of the Project | 8 weeks |
Work Environment | ALEXANDRIA group, please see https://www.cl.cam.ac.uk/~lp15/Grants/Alexandria/. We will be working remotely. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | |
Brief Description of the Project | The student will participate in a project involving the formalisation of material in number theory/additive combinatorics using the proof assistant (interactive theorem prover) Isabelle/HOL. |
References | Recent related work: |
Prerequisite Skills | Mathematical Analysis; Algebra/Number Theory |
Other Skills Used in the Project | Previous experience in Isabelle/HOL or other proof assistants is desirable but not necessary. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html |
Programming Languages | Isabelle/HOL. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html |
Deep learning for microscopy image reconstruction
Project Title | Deep learning for microscopy image reconstruction |
Keywords | deconvolution, deep learning, lightsheet microscopy, image reconstruction |
Contact Name | Leila Muresan |
Contact Email | lam94@cam.ac.uk |
Company/Lab/Department | Dept. of Physiology, Development and Neuroscience |
Address | Anatomy building, CB2 3DY |
Period of the Project | 8 weeks |
Work Environment | The student will be part of an ongoing collaboration between MRC-LMB (Jerome Boulanger), DAMTP (Yury Korolev), the Quantitative Biology Institute, Yale University (Bogdan Toader) and PDN (Leila Muresan). There will be weekly meetings of the entire team (possibly online); otherwise the schedule and location arrangements are flexible. |
Project Open to | Master's (Part III) students |
Background Information | In recent years, machine learning and especially deep-learning techniques have had a huge impact on microscopy image analysis. For instance, solutions to difficult segmentation tasks have improved greatly for both 2D and 3D data, and several image reconstruction and denoising methods have been designed that currently constitute the state of the art. Deep learning has also been used to model the data for single-molecule localization microscopy, providing insightful forward models. |
Brief Description of the Project | This project will focus on combining learned regularization and forward models for solving deconvolution problems in a mathematically coherent manner. The goal is to exploit the flexibility of learned models and the high performance of learned regularizers over traditional regularization methods while maintaining mathematical guarantees for the reconstructed images. (The goal of the project is open-ended; the candidate will contribute to ongoing work on lightsheet microscopy deconvolution.) |
References | |
Prerequisite Skills | |
Other Skills Used in the Project | Numerical Analysis; Image processing; Simulation |
Programming Languages | No Preference |
Bayesian machine learning and theory in cosmology and particle physics
Project Title | Bayesian machine learning and theory in cosmology and particle physics |
Keywords | Bayesian Inference; Machine Learning; Cosmology; |
Contact Name | Dr Will Handley |
Contact Email | wh260@cam.ac.uk |
Company/Lab/Department | Kavli Institute for Cosmology/Cavendish Laboratory |
Address | Kavli Institute for Cosmology/Cavendish Laboratory |
Period of the Project | 8-12 weeks (depending on funding) |
Work Environment | Working with my postdocs and PhD students in the KICC |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | |
Brief Description of the Project |
In this project the student will work with Dr Handley and his team investigating the development and application of Bayesian machine learning techniques to modern and future cosmological and particle physics datasets. The precise details of the project will be tailored to the student's interests and skill set, but possible topics/projects include:
Over the course of the project students can expect to learn some/all of:
Essential:
Desirable:
|
References | |
Prerequisite Skills | |
Other Skills Used in the Project | |
Programming Languages | Python; Mathematica or Maple |
Novel Flexible Polyhedra
Project Title | Novel Flexible Polyhedra |
Keywords | geometry, computer programming |
Contact Name | Simon Guest |
Contact Email | sdg@eng.cam.ac.uk |
Company/Lab/Department | Engineering Department |
Address | sdg@eng.cam.ac.uk |
Period of the Project | 8 weeks |
Work Environment | The student will be part of a small group in the Civil Engineering Building in West Cambridge. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | Cauchy showed that all convex polyhedra are rigid, but it wasn't until the 1970s that Bob Connelly found a non-convex polyhedron that was flexible. A related result is Alexandrov's uniqueness theorem, which shows that any polyhedron with a given metric (i.e. that can be folded from a net) has a unique convex realisation; a constructive proof of this was only found by Bobenko and Izmestiev in 2008. These results are closely connected to recent work in rigid origami, and a better understanding of the connection between them might allow us to develop novel foldable structures, for instance for use in spacecraft. |
Brief Description of the Project | The project will start by developing a computer implementation of Bobenko and Izmestiev's algorithm for finding convex realisations for polyhedra, and then examine whether this algorithm can be modified to understand better the behaviour of (clearly non-convex) flexible polyhedra. This might give us insight to develop new flexible polyhedra, for instance novel forms that are not fully triangulated. |
References | Much of the background is given in the recent book 'Frameworks, Tensegrities, and Symmetry' published by CUP, and available electronically from the University Library. |
Prerequisite Skills | Geometry/Topology |
Other Skills Used in the Project | |
Programming Languages | Python; MATLAB |
Quantum stability of a novel theory of gravity
Project Title | Quantum stability of a novel theory of gravity |
Keywords | modified gravity, torsion, computer algebra, effective field theory, cosmology |
Contact Name | Will Barker |
Contact Email | wb263@cam.ac.uk |
Company/Lab/Department | Cavendish astrophysics & KICC |
Address | Office K34, Kavli Institute for Cosmology, Cambridge, CB3 0HA |
Period of the Project | 8 weeks |
Work Environment | The student may work remotely, but would ideally have a desk at the Kavli Institute for Cosmology, Cambridge (KICC). We have a small modified gravity nexus belonging to the Cavendish Astrophysics Group, comprising Professors Lasenby and Hobson and myself. The broader environment in the Kavli is dominated by theoretical, observational and statistical cosmologists. The student would be encouraged to take advantage of seminars and networking both at the Kavli and the CMS. There are also opportunities to liaise remotely with astroparticle theorists at CEICO in Prague and at the Instituut Lorentz in Leiden. We have free coffee. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information |
Einstein's General Relativity (GR) remains the preferred effective theory of gravity as spacetime curvature, explaining the orbital precession of Mercury and solar bending of starlight while underpinning modern cosmology. However, GR does not explain dark matter or dark energy, while an alleged 'Hubble tension' indicates that our Universe is expanding 10% faster than it should be. And of course, GR continues to stubbornly resist attempts at (complete) quantum reformulation. We recently attracted some attention by proposing an alternative theory of gravity as a blend of spacetime torsion and curvature: this appears to provide a cosmological constant and alleviate the Hubble tension. Our Lagrangian is wildly different from that of GR, with a quantum structure suggestive of renormalisability. We believe the theory adopts a torsion vacuum expectation value (VEV) at a primordial epoch, on the back of which the good classical phenomena emerge. However, many quantum/classical aspects of this torsion VEV, and the violent early-Universe physics of its formation, remain shrouded in mystery... |
Brief Description of the Project |
The findings of the project may debunk our theory or, if we are lucky, propel it further into the spotlight of community interest. The student may wish to target one of several new fronts we are opening in our research campaign:
1) *Quantum stability and the infrared* -- This is an urgent question; we don't really know if the torsion VEV is stable against quantum fluctuations, as is the case for the Minkowski vacuum of GR. The student will apply well established effective field theory and ghost condensate techniques to characterise the infrared environment of the VEV. Extensions to the ultraviolet are of course welcome depending on expertise, though expected to be more challenging. A stable vacuum is quite a big deal, while convincing instabilities would seem to rule our theory out: either way this avenue promises high returns.
2) *Cosmological perturbation theory* -- There is a very well established theory dictating how cosmic density perturbations evolve under gravity, which supports GR to amazing precision based on tiny anisotropies in the cosmic microwave background and the clustering of matter on the grandest scales. The student will extract and characterise the classical perturbation equations (and perhaps convenient gauges) around the torsion VEV, matching against GR where possible. This also targets aspects of the infrared environment, but uses classical methods so does not require prior knowledge of QFT. Apart from offering a neat standalone stability test the perturbation theory will facilitate, in the long run (2023), sophisticated Monte Carlo tests against cosmological survey data: to this end a successful student would also have a stake in these future research way points.
3) *Primordial symmetry breaking* -- We imagine that the Big Bang left our gravity theory in a torsionless conformal phase, bathed in the standard model plasma. So how and when does the torsion VEV form in relation to the condensation of the Higgs field? Could this process have driven inflation, the violent expansion thought to have occurred in the very early Universe? A decaying deviation from the torsion VEV just after inflation can alleviate the Hubble tension: what physics sets this initial condition? The student may wish to merely explore these questions using the background cosmology equations, and there is a viable research-grade project at this level. However depending on interest/experience in electroweak symmetry breaking or effective quantum theories of inflation, we may hope to propose a novel inflationary mechanism.
These topics are not exhaustive and are subject to shift as we study the theory throughout spring 2022. |
References |
|
Prerequisite Skills | Mathematical physics; PDEs; familiarity with GR and (very) introductory cosmology |
Other Skills Used in the Project | Data Visualization; QFT and cosmological perturbation theory are a bonus according to chosen topic. |
Programming Languages | Python; Mathematica (you can get a free license from Maths dept.!), maybe Maple if you prefer it. |
R&D portfolio optimisation
Project Title | R&D portfolio optimisation |
Keywords | R&D portfolio, biopharma, monte carlo simulation, predictive analytics |
Contact Name | Nektarios Oraiopoulos |
Contact Email | no245@cam.ac.uk |
Company/Lab/Department | Judge Business School |
Address | Trumpington Street, CB2 1AG |
Period of the Project | 8-10 weeks |
Work Environment | Student will work mostly on their own and will have regular meetings with academic supervisor. Working remotely is fine. Would be good to have some meetings face to face. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | Managing the R&D pipeline in a biopharmaceutical company is one of the most significant challenges in the industry. Decisions have to be made regarding which projects should be advanced to the next stage (and therefore consume significant financial resources of the company) and which projects should be put on hold or terminated. Those decisions are made under significant uncertainty: each project has a likelihood of success, and specifically estimated rates of false positives (the current data look promising, but actually the project is doomed to fail) and false negatives (the current data look weak, but actually the project will work, if given resources). |
Brief Description of the Project | The student will work closely with Dr Oraiopoulos (https://www.jbs.cam.ac.uk/faculty-research/faculty-a-z/nektarios-oraiopo...) to develop an R&D portfolio optimisation model. The optimisation model will have as inputs estimates regarding the cost of each project, the false positive and negative rates, the commercial potential, etc. and it will calculate (and visualize) the expected reward and risk of different portfolios (allowing the decision-maker to select the most promising one). E.g., the model might suggest that the decision-maker should select only 6 out of the 10 current projects. A key characteristic of the model should be that it compares portfolios of projects rather than single projects. The student might also be given access to large datasets that would allow her/him to estimate those false positive/negative rates using predictive models. The student will also receive feedback from experienced executives from the pharmaceutical industry that have created drugs that transformed the industry. |
References | |
Prerequisite Skills | Statistics; Probability/Markov Chains; Simulation; Predictive Modelling; Data Visualization |
Other Skills Used in the Project | |
Programming Languages | Python; MATLAB; R |
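To make the idea of comparing whole portfolios rather than single projects more concrete, here is a minimal Monte Carlo sketch in Python. It is only an illustration under stated assumptions: the project names, costs, payoffs, prior success probabilities, error rates and budget are all hypothetical, every project is assumed to have data that currently look promising, project outcomes are assumed independent, and the standard deviation of net reward is used as a stand-in for "risk". None of these choices comes from the proposal itself.

```python
import itertools
import random
import statistics

# Hypothetical project data: cost, payoff if successful, prior success chance.
PROJECTS = {
    "P1": {"cost": 10.0, "payoff": 60.0, "prior": 0.30},
    "P2": {"cost": 20.0, "payoff": 150.0, "prior": 0.20},
    "P3": {"cost": 15.0, "payoff": 40.0, "prior": 0.50},
    "P4": {"cost": 25.0, "payoff": 200.0, "prior": 0.10},
}
FALSE_POSITIVE = 0.25  # assumed P(data look promising | project will fail)
FALSE_NEGATIVE = 0.20  # assumed P(data look weak | project will succeed)


def posterior_success(prior):
    """Bayes update of the success probability given promising-looking data."""
    hit = (1.0 - FALSE_NEGATIVE) * prior
    return hit / (hit + FALSE_POSITIVE * (1.0 - prior))


def simulate_portfolio(names, n_draws=5000, rng=None):
    """Monte Carlo estimate of (mean, standard deviation) of net reward."""
    if rng is None:
        rng = random.Random(0)
    rewards = []
    for _ in range(n_draws):
        net = 0.0
        for name in names:
            project = PROJECTS[name]
            success = rng.random() < posterior_success(project["prior"])
            net += (project["payoff"] if success else 0.0) - project["cost"]
        rewards.append(net)
    return statistics.mean(rewards), statistics.pstdev(rewards)


def rank_portfolios(budget):
    """Evaluate every affordable subset of projects, best expected reward first."""
    results = []
    for size in range(1, len(PROJECTS) + 1):
        for names in itertools.combinations(PROJECTS, size):
            if sum(PROJECTS[n]["cost"] for n in names) <= budget:
                mean, risk = simulate_portfolio(names, rng=random.Random(1))
                results.append((names, mean, risk))
    return sorted(results, key=lambda item: item[1], reverse=True)


ranked = rank_portfolios(budget=50.0)
for names, mean, risk in ranked[:3]:
    print(names, round(mean, 1), round(risk, 1))
```

Exhaustive enumeration is only feasible for a handful of projects; a larger pipeline would need a proper optimisation method, and the reward/risk pairs could be plotted to visualise the trade-off.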
Thawing frozen mutations in an ancient transmissible cancer
Project Title | Thawing frozen mutations in an ancient transmissible cancer |
Keywords | Genomics, Cancer, Evolution, Mutational Signatures |
Contact Name | Kevin Gori |
Contact Email | kcg25@cam.ac.uk |
Company/Lab/Department | Department of Veterinary Medicine |
Address | Department of Veterinary Medicine, Madingley Road, Cambridge, CB3 0ES |
Period of the Project | 20 June - 19 August (some flexibility) |
Work Environment | The student will work as part of the Transmissible Cancer group, based at the Department of Veterinary Medicine. Preferably they will spend at least three days a week in the lab. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | The project will focus on studying transmissible cancer, a rare class of cancer that has developed the ability to infect new hosts. The cancer in question is Canine Transmissible Venereal Tumour (CTVT), which affects dogs worldwide. The project aims to use genomic sequences of CTVT to examine early events in its many-centuries-long evolution that may help to explain how it arose in the first place. |
Brief Description of the Project |
This project will be based at the Department of Veterinary Medicine, and will involve genomic analysis of the canine transmissible venereal tumour (CTVT). CTVT is a transmissible cancer, which is a rare class of cancer that has developed the ability to infect new hosts, and behaves rather like a parasitic organism. CTVT is spread among dogs when they come in direct physical contact with tumour tissue infecting another individual, usually during mating. CTVT is by far the oldest clonally reproducing cancer known on Earth. The project will build on recent work done in our lab to unravel the earliest events that befell the tumour in its progression towards becoming infectious and globally endemic. Using high coverage DNA sequencing information from several CTVT samples, as well as from the uninfected tissue of their hosts, we have previously estimated the evolutionary tree that relates these tumours. This work has identified a previously unseen mutational signature (‘signature A’) that was active during the early evolution of CTVT, but was later switched off. In this project we will take estimates of genomic copy number in our samples, and use these to find genomic regions that have been duplicated in CTVT’s development. From these duplications we can identify ‘frozen time points’: fragments of DNA sequence that were gained during the early evolution of the cancer. The relative ages of these fragments will be estimated by the degree to which they have accumulated new mutations. Combined with this timing information, by examining the fragments for the presence of signature A we will be able to determine whether signature A occurred continuously, or in bursts. Additionally, the earliest fragments will be highly representative of the genotype of the animal in which CTVT arose, illuminating perhaps the characteristics of the earliest domesticated dogs. |
References | Strakova, Andrea, and Elizabeth P. Murchison. 2015. “The Cancer Which Survived: Insights from the Genome of an 11000 Year-Old Cancer.” Current Opinion in Genetics & Development 30: 49–55. Baez-Ortega, Adrian, Kevin Gori, Andrea Strakova, Janice L. Allen, Karen M. Allum, Leontine Bansse-Issa, Thinlay N. Bhutia, et al. 2019. “Somatic Evolution and Global Expansion of an Ancient Transmissible Cancer Lineage.” Science 365 (6452): eaau9923. Leathlobhair, Máire Ní, Angela R. Perri, Evan K. Irving-Pease, Kelsey E. Witt, Anna Linderholm, James Haile, Ophelie Lebrasseur, et al. 2018. “The Evolutionary History of Dogs in the Americas.” Science 361 (6397): 81–85. |
Prerequisite Skills | Statistics; Data Visualization |
Other Skills Used in the Project | Probability/Markov Chains; Simulation |
Programming Languages | Python; R |
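The description above says that the relative ages of duplicated fragments will be estimated from the mutations they have accumulated. As a rough illustration only, the Python sketch below applies a simple molecular-clock estimate to the simplest case, a single-copy gain that leaves two copies of one allele and one copy of the other. The constant mutation rate, the 2+1 copy configuration, the function name and the example counts are all assumptions for illustration; the source does not state which timing method the group uses.

```python
def pre_duplication_fraction(n_duplicated, n_single):
    """Estimate the fraction of elapsed time that preceded a single-copy gain.

    n_duplicated: mutations carried by both copies of the gained allele
                  (they must predate the duplication).
    n_single:     mutations carried by only one of the three copies.

    With a constant per-copy mutation rate m and total elapsed time 1, a
    duplication at time T gives, in expectation,
        n_duplicated = m * T
        n_single     = m * T + 3 * m * (1 - T)
    so that T = 3 * n_duplicated / (n_single + 2 * n_duplicated).
    """
    total = n_single + 2 * n_duplicated
    if total == 0:
        raise ValueError("no mutations observed in this region")
    # Sampling noise can push the raw estimate above 1, so clamp it.
    return min(1.0, 3 * n_duplicated / total)


# Hypothetical mutation counts (n_duplicated, n_single) for two gained regions.
regions = {"region_1": (30, 30), "region_2": (5, 80)}
timing = {name: pre_duplication_fraction(*counts) for name, counts in regions.items()}

# Regions with a smaller fraction were gained earlier in the lineage's history.
for name in sorted(timing, key=timing.get):
    print(name, round(timing[name], 3))
```

A small estimate means few mutations had accumulated before the gain, which is the sense in which an early duplication acts as a "frozen time point".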
Algebraic geometry and problem-based learning
Project Title | Algebraic geometry and problem-based learning |
Keywords | algebraic geometry, schemes, mathematical exposition, scientific writing, mathematics education |
Contact Name | Anthony Bordg |
Contact Email | apdb3@cam.ac.uk |
Company/Lab/Department | Department of Computer Science and Technology |
Address | William Gates Building, JJ Thomson Avenue, Cambridge CB3 0FD |
Period of the Project | 4 weeks |
Work Environment | The student will work with Anthony Bordg. Remote work possible. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | |
Brief Description of the Project | This project will deal with issues in teaching and learning a very abstract field of mathematics: algebraic geometry. The point of view of a student is desirable and will be valued. Together with Anthony Bordg the student will work on completing an exposition of schemes in algebraic geometry that intends to fill a wide gap in the literature between nontechnical presentations and advanced textbooks. The final goal is the publication of an expository article on the topic in a mathematics journal, e.g. Emergent Scientist, Rocky Mountain Journal of Mathematics ... The student will be given extensive guidance and training in mathematical writing. |
References | Anthony Bordg, "What is a Scheme in Algebraic Geometry? A Problem-Oriented Approach", https://drive.google.com/file/d/19hsesOZl70hmzYxcV_OgINOgIg2SBD1q/view |
Prerequisite Skills | Geometry/Topology; Algebra/Number Theory; algebraic geometry |
Other Skills Used in the Project | |
Programming Languages |
Formalising Modular Forms and Dirichlet Series in Isabelle/HOL
Project Title | Formalising Modular Forms and Dirichlet Series in Isabelle/HOL |
Keywords | number theory, modular forms, interactive theorem proving, type theory, Isabelle/HOL |
Contact Name | Anthony Bordg |
Contact Email | apdb3@cam.ac.uk |
Company/Lab/Department | Department of Computer Science and Technology |
Address | William Gates Building, JJ Thomson Avenue, Cambridge CB3 0FD |
Period of the Project | 8 weeks |
Work Environment | Your supervisor will be Anthony Bordg, but you will interact with the whole ALEXANDRIA team led by Prof. Larry Paulson. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | |
Brief Description of the Project | You will work on a formalisation in Isabelle/HOL of Apostol's textbook "Modular Functions and Dirichlet Series in Number Theory". The main definitions and statements have already been formalised (see the GitHub repo in References), hence you will focus on proving these statements with the help of Isabelle/HOL's efficient automation. This could lead to pioneering and high-impact work in the fast-growing field of the formalisation of mathematics. |
References | - Modular Functions and Dirichlet Series in Isabelle/HOL, https://github.com/AnthonyBordg/Number_Theory (GitHub repo) - Tom Apostol, Modular Functions and Dirichlet Series in Number Theory, Springer - Isabelle Zulip chat: https://isabelle.zulipchat.com |
Prerequisite Skills | Mathematical Analysis; Algebra/Number Theory; complex analysis |
Other Skills Used in the Project | Previous experience with Isabelle/HOL or any other proof assistant (Coq, Lean ...) is desirable but not necessary. Please see https://www.cl.cam.ac.uk/research/hvg/Isabelle/index.html |
Programming Languages | Isabelle/HOL |
Bayesian Learning for Automation
Project Title | Bayesian Learning for Automation |
Keywords | Bayesian learning; Neural networks; MCMC; Reinforcement Learning |
Contact Name | Sumeetpal S. Singh |
Contact Email | sss40@cam.ac.uk |
Company/Lab/Department | Engineering |
Address | Department of Engineering, Trumpington Street, CB2 1PZ |
Period of the Project | 10-12 weeks (late June/early July start) |
Work Environment | Join a team involving 2 other PhD students working on related topics. Based at the Department of Engineering with a mixture of remote and in-person presence. Periodic updates and discussions with sponsors Mathworks. |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | One very promising technique for automation is to gather data from an expert demonstration and then learn the expert's policy using Bayesian inference. The learnt policy is then extrapolated to automate the task in novel settings. The potential applications of this approach are numerous, e.g. automated navigation. The key challenges of this technique of "control by mimicry" are: 1. Learning the expert's policy, a function, and accurately representing uncertainty. 2. To improve knowledge of the expert's policy, more data needs to be gathered. This should be done sparingly, as data gathering can be expensive, and should also be guided by an optimality criterion, such as an information-theoretic criterion. Impact on academic area, on user community, industry, and beyond: This is an exciting and novel endeavour that exploits recent advances in Statistics for challenging automation problems. [1] Simon L Cotter, Gareth O Roberts, Andrew M Stuart, and David White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, pages 424–446, 2013. [2] T. Sell and S.S. Singh. Trace-class Gaussian priors for Bayesian learning of neural networks with MCMC. Under review. arXiv e-print arXiv:2012.10943. [3] https://gym.openai.com/ |
Brief Description of the Project | Deliverables: (i) A generic MATLAB (potentially in collaboration with sponsor MathWorks) and Python implementation of our MCMC sampling algorithm for trace-class neural network priors [2], which can also be used more widely for other applications of Bayesian neural networks. (ii) Proof-of-concept automation implementations on exemplar tasks from OpenAI Gym [3]. (iii) Potential use of our Bayesian neural network sampling algorithm within the CUED curriculum (MEng level). Currently Bayesian neural networks are not taught, nor are they widely experimented with in the MEng projects. (iv) An outreach activity for years 5/6 school children in the field of Reinforcement learning/Automation via a Microbit implementation. ALL CODE WILL BE MADE PUBLIC. |
References | [1] Simon L Cotter, Gareth O Roberts, Andrew M Stuart, and David White. MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science, pages 424–446, 2013. [2] T. Sell and S.S. Singh. Trace-class Gaussian priors for Bayesian learning of neural networks with MCMC. Under review. arXiv e-print arXiv:2012.10943. [3] https://gym.openai.com/ |
Prerequisite Skills | Statistics; Probability/Markov Chains |
Other Skills Used in the Project | Numerical Analysis; PDEs; Mathematical Analysis |
Programming Languages | Python; MATLAB; Deliverable in MATLAB as required by sponsor. |
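For readers unfamiliar with MCMC under a Gaussian prior, the following minimal Python sketch shows the preconditioned Crank-Nicolson (pCN) sampler, one of the methods discussed in the Cotter et al. paper cited in this proposal. It is a toy, scalar illustration and not the project's algorithm: the proposal does not say that the trace-class prior sampler of Sell and Singh takes this form, and the standard normal prior, the single observation `y = 1`, the unit noise variance, the step size `beta` and the function names are all assumptions made for the example.

```python
import math
import random


def pcn_sample(neg_log_likelihood, n_steps, beta=0.5, x0=0.0, seed=0):
    """Preconditioned Crank-Nicolson sampler for a scalar with prior N(0, 1)."""
    rng = random.Random(seed)
    x = x0
    phi_x = neg_log_likelihood(x)
    samples = []
    for _ in range(n_steps):
        # Proposal that leaves the N(0, 1) prior invariant.
        proposal = math.sqrt(1.0 - beta ** 2) * x + beta * rng.gauss(0.0, 1.0)
        phi_prop = neg_log_likelihood(proposal)
        # The acceptance ratio involves only the likelihood, not the prior.
        if math.log(1.0 - rng.random()) < phi_x - phi_prop:
            x, phi_x = proposal, phi_prop
        samples.append(x)
    return samples


def toy_neg_log_likelihood(x):
    """Hypothetical data model: one observation y = 1 with unit noise variance."""
    return 0.5 * (x - 1.0) ** 2


chain = pcn_sample(toy_neg_log_likelihood, n_steps=20000)
kept = chain[1000:]  # discard an initial burn-in segment
mean = sum(kept) / len(kept)
variance = sum((v - mean) ** 2 for v in kept) / len(kept)
print(round(mean, 2), round(variance, 2))
```

For this conjugate toy problem the exact posterior is Gaussian with mean 0.5 and variance 0.5, so the printed estimates can be checked against those values. The appeal of pCN in function-space settings is that the acceptance rule does not degrade as the dimension of the unknown grows, which is the motivation given in the cited paper's title.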
AI for Coronary Artery Imaging
Project Title | AI for Coronary Artery Imaging |
Keywords | Autoencoders, deep learning, python, imaging, computer vision |
Contact Name | Mike Roberts |
Contact Email | mr808@cam.ac.uk |
Company/Lab/Department | DAMTP / Cardiology |
Address | Department of Medicine / DAMTP |
Period of the Project | 8 weeks + |
Work Environment | In DAMTP or the Department of Medicine |
Project Open to | Undergraduates; Master's (Part III) students |
Background Information | Coronary arteries can be imaged using Optical Coherence Tomography (OCT), which shows features of the artery from the inside and allows disease to be identified. These images are extremely high-dimensional, making many deep learning methods intractable. |
Brief Description of the Project | We will apply deep learning methods for image compression, encoding and reconstruction to encode high-dimensional images as low-dimensional representations. This then allows for downstream identification of diseased tissue, quantification and prediction of outcomes. |
References | |
Prerequisite Skills | Image Processing |
Other Skills Used in the Project | |
Programming Languages | Python |