
Summer Research Programmes

 

This is a list of CMP academic project proposals from summer 2017.

Value of Information in Early Phase Clinical Trials
Mathematical Solutions for Aeronautical Gas-turbine Noise Emissions
Signal Extraction Techniques for 21-cm Cosmology Radiometric Experiments
Compressed Sensing Development for Nuclear Magnetic Resonance (NMR) Spectroscopy
Modelling the light distribution in photoacoustic imaging
Spatiotemporal dynamics of plant growth hormone gibberellin (GA) and cellular growth in plant cells
Stem cell packing
Single cell normalisation
Application of Compressive Sensing to Hyperspectral Endoscopy
Tensor Networks on TensorFlow
Modelling signal processing for stem cell control
Modeling the evolutionary dynamics leading to Acute Myeloid Leukemia
Detection of animal faecal parasites using image analysis with an ioLight portable microscope
Machine Learning for Internet Traffic Classification
Modelling Tandem Queues in Data Centres
Counting motifs in bipartite graphs
Music and Mathematics: Formal Methods in Style Detection
Image analysis of ice cream microstructure
Application of mathematics in real world clinical management

 


Value of Information in Early Phase Clinical Trials

 



Contact Name:

Simon Bond

Contact email:

simon.bond@addenbrookes.nhs.uk

Lab/Department:

Cambridge Clinical Trials Unit

Contact Address:

Addenbrooke's Hospital, Coton House Level 6 - Box 401

Hills Road

Period of the Project:

summer 2017

Brief Description of  Project:

Exploring and developing novel tools to design clinical trials and justify the choice of sample size.

 

A current standard method for choosing a sample size in confirmatory clinical trials is based on hypothesis testing and providing enough statistical power. This does not generalise well to exploratory studies in early clinical research. An alternative method, called “Value of Information”, is rooted in decision theory and health economic analysis, and attempts to quantify the financial value of gaining information. Methods have been developed to parallel the traditional hypothesis-test-statistical-power approach of late stage confirmatory studies, but little work exists on early stage studies, particularly when there is uncertainty about the standard deviation of an endpoint on top of uncertainty around its mean value.
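As a rough illustration of the idea, the sketch below computes the expected value of sample information (EVSI) for a deliberately simplified decision: adopt a treatment when its expected effect is positive, with a payoff equal to the effect itself. It assumes a normal prior on the mean effect and a known endpoint standard deviation, which is precisely the simplification the project would relax. All function names and numbers are hypothetical.

```python
import math


def expected_positive_part(mean, sd):
    """Return E[max(X, 0)] for X ~ Normal(mean, sd**2)."""
    if sd == 0.0:
        return max(mean, 0.0)
    z = mean / sd
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mean * cdf + sd * pdf


def evpi(prior_mean, prior_sd):
    """Expected value of perfect information about the treatment effect."""
    return expected_positive_part(prior_mean, prior_sd) - max(prior_mean, 0.0)


def evsi(prior_mean, prior_sd, sigma, n):
    """Expected value of sample information from n observations.

    Assumes the endpoint standard deviation `sigma` is known (an
    assumption made only to keep this sketch in closed form).
    """
    data_var = sigma ** 2 / n
    # Before the trial is run, the posterior mean is itself normally
    # distributed around the prior mean with this shrunken spread.
    pre_posterior_sd = prior_sd * math.sqrt(prior_sd ** 2 / (prior_sd ** 2 + data_var))
    return expected_positive_part(prior_mean, pre_posterior_sd) - max(prior_mean, 0.0)


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, round(evsi(0.1, 1.0, 4.0, n), 4))
```

In this toy setting EVSI grows with the sample size but never exceeds the value of perfect information, so a sample size could be chosen by comparing EVSI with the cost of recruiting n patients.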

Skills Required:

Statistical inference, computer programming

Skills Desired:

familiarity with R programming language

clinical trials

decision theory

economics

Project Open to:

Part III Students

Deadline to register interest:

 

[Return to List]

Mathematical Solutions for Aeronautical Gas-turbine Noise Emissions

 

 

Contact Name: Luca Magri
Contact email: lm547@cam.ac.uk
Lab/Department: Engineering Department
Contact Address: Engineering Department
JDB, Fluid Dynamics group
Trumpington Street
Cambridge CB2 1PZ
Period of the Project: June - Sept 2017
Brief Description of Project: Combustion noise is one of the dominant sources of noise pollution generated by the whole turbojet, and it is bound to increase with the implementation of low-emission aeroengines. Entropy and vorticity inhomogeneities that exit the combustion chamber and are accelerated through the turbine are the two well-known indirect mechanisms that produce combustion noise. Recently, we discovered a third indirect noise mechanism caused by inhomogeneities in the gas composition. The system of hyperbolic equations was discretized numerically and solved. Numerical solutions are, however, time consuming and depend on the numerical scheme used.

The objective is to find a full analytical solution using mathematical techniques that originated in quantum mechanics. The level of noise will be calculated for different Mach numbers, nozzle geometries and flow configurations. The analytical solution will overcome the limitations of the approximate asymptotic methods used in the literature and industry.

Skills Required: Some familiarity with the following topics would be an advantage, but is not essential:

- Fluid dynamics and/or acoustics
- Hyperbolic Partial Differential Equations
- Perturbation methods

Skills Desired:  
Project Open to: Part III (master's) students, PhD Students
Deadline to register interest:  

[Return to List]

Signal Extraction Techniques for 21-cm Cosmology Radiometric Experiments

 

 

Contact Name: Eloy de Lera Acedo
Contact email: eloy@mrao.cam.ac.uk
Lab/Department: Physics
Contact Address: Astrophysics Group
Battcock centre for Astrophysics
Cavendish Laboratory
University of Cambridge
JJ Thomson Avenue
Cambridge CB3 0HE
Period of the Project: 8-12 weeks
Brief Description of Project: The Cosmic Dawn and the Epoch of Re-ionization are early cosmic epochs (until ~1 billion years after the Big Bang) when the Universe went from being a vast volume filled with chemical elements such as hydrogen to becoming the realm of celestial objects (stars, galaxies, black holes, etc.) that we can see today from Earth. The precise physical processes that took place at the time, leading to the Universe we know today, are unknown, and their study is considered to be the last frontier in cosmological science. It is recognized that by studying the red-shifted radio-frequency emission from hydrogen itself (the raw material that formed the first luminous objects) one could probe these epochs through cosmological time and shed light on some of the biggest remaining mysteries in the history of the cosmos. Super instruments like the SKA telescope will aim, in a few years, to do full tomography of these epochs. However, with a single-antenna radiometer it is theoretically possible to detect the sky-averaged hyperfine transition line of atomic hydrogen (red-shifted from 21 cm to a few metres due to the expansion of the Universe) in a matter of a few days. This, however, would require an extraordinary knowledge of the radio instrument and its spatial and spectral effects on both the cosmological signal and the much brighter foreground emissions. In this project the student will work on the development of the algorithms and signal extraction techniques required for the detection of the cosmic signal in this type of experiment. The student will work on a software framework and will back their findings with realistic simulations including the instrument model, the foregrounds and the cosmological signal, to prepare for the observations and data analysis.
Skills Required: - Programming skills: Python/Matlab
- Knowledge of signal extraction techniques (e.g. matched filter)
- Knowledge of Bayesian theory
Skills Desired: - Background in experimental cosmology
- Strong background in signal extraction techniques (e.g. matched filter)
- Strong background in Bayesian theory
Project Open to: Undergraduates, Part III (master's) students, PhD Students
Deadline to register interest: 3 March
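
For readers unfamiliar with the matched filter named among the skills above, here is a minimal sketch. It slides a known template along a data vector and returns the best-fit amplitude at each position, which is the least-squares estimate under white noise. The Gaussian template, its width and the noise-free data are illustrative assumptions only; a real 21-cm analysis would also have to model foregrounds and the instrument.

```python
import math


def matched_filter(data, template):
    """Slide `template` along `data`; return the best-fit amplitude at each lag.

    For white noise, the value at lag k is the least-squares estimate of the
    template amplitude if the signal started at sample k.
    """
    m = len(template)
    norm = sum(t * t for t in template)
    return [
        sum(data[k + i] * template[i] for i in range(m)) / norm
        for k in range(len(data) - m + 1)
    ]


def gaussian_template(length, width):
    """Unit-height Gaussian profile (a hypothetical stand-in for the signal shape)."""
    centre = (length - 1) / 2.0
    return [math.exp(-0.5 * ((i - centre) / width) ** 2) for i in range(length)]


template = gaussian_template(11, 2.0)
data = [0.0] * 50
for i, t in enumerate(template):
    data[17 + i] += -0.2 * t  # absorption feature of depth 0.2 starting at sample 17

response = matched_filter(data, template)
best_lag = max(range(len(response)), key=lambda k: abs(response[k]))
print(best_lag, response[best_lag])
```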

[Return to List]

Compressed Sensing Development for Nuclear Magnetic Resonance (NMR) Spectroscopy

 

 

Contact Name: Mark Bostock
Contact email: mjb218@cam.ac.uk
Lab/Department: Biochemistry (Laboratory of Dr. Daniel Nietlispach)
Contact Address: Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Old Addenbrooke's Site, Cambridge CB2 1GA.
Period of the Project: 8 weeks (negotiable)
Brief Description of Project: NMR spectroscopy is a widely used technique in Biology and Chemistry enabling atomic resolution structural analysis of molecules such as proteins, providing insights into their function and dynamic behaviour.

Over recent years we have been actively developing new data processing methodologies that enable the reconstruction of NMR data that is incompletely sampled in the time domain. These methods are generally termed ‘non-uniform sampling’ (NUS), and they typically require non-Fourier-transform reconstruction techniques to convert irregularly sampled time-domain data into the frequency domain. We have been working on the development and implementation of methodologies known as ‘compressed sensing’ (CS), based on l_p-norm minimisation, where typically p=1. CS arose in the literature of information theory [1], [2] and has been applied widely, for example in MRI [3] and NMR [4], [5], as well as in other areas such as image compression, astronomy and tomography [6]. The area is revolutionising the NMR field, allowing us to obtain information which was previously inaccessible and increasing the range of challenging biomolecules and scenarios which NMR can study. The main benefits of the combination of NUS recording and CS data reconstruction are increases in signal-to-noise and spectral resolution, both typically limiting factors in NMR spectroscopy.
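
To make the l_1-minimisation idea concrete, the following is a minimal sketch of iterative soft thresholding (ISTA), one of the simplest CS reconstruction algorithms. It is not the group's software: the sensing matrix here is a small random real matrix rather than a sub-sampled Fourier operator acting on complex NMR data, and all names and parameter values are assumptions chosen for illustration.

```python
import random


def soft_threshold(v, t):
    """Shrink v towards zero by t (the proximal operator of the l1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Multiply the transpose of A by r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1"""
    residual = [p - yi for p, yi in zip(matvec(A, x), y)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def ista(A, y, lam, n_iter=300):
    # The squared Frobenius norm bounds the Lipschitz constant of the
    # gradient, so a step of 1/L never increases the objective.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [p - yi for p, yi in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)
        x = [soft_threshold(xi - g / L, lam / L) for xi, g in zip(x, grad)]
    return x


rng = random.Random(0)
n, m = 32, 16  # 32 unknowns, only 16 measurements
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[5], x_true[20] = 1.0, -0.7  # a sparse "spectrum" with two peaks
y = matvec(A, x_true)
x_hat = ista(A, y, lam=0.1)
```

The conservative step size keeps the sketch short but makes convergence slow; the faster algorithms referred to under "Algorithm development" improve on exactly this point.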

Our current interests are the following:
1) Algorithm development
CS is an actively developing area within applied maths. New algorithms are regularly released with improvements in speed and reconstruction accuracy. We are interested in implementing some of these new algorithms for NMR data processing and assessing any improvements over the existing algorithms. This would require a literature search to identify new algorithms and then coding the algorithm within our existing software package for application to NMR data reconstruction.

2) Reducing the sampling requirements with prior information
Prior information is often available in NMR studies from existing experiments. Repeat measurements frequently look at spectral changes (difference experiments) to track biological processes. This prior information should allow a substantial reduction in sampling requirements. This would involve working with, for example, protein dynamics data and developing the existing algorithm to use available prior information.

3) Investigating the optimum sampling requirements
A currently under-explored area of this field is determining the optimum sampling schedule to use for data acquisition. While some general principles are understood [7], the requirements are likely to vary dependent on the properties of different experiments [8]. A good/bad sampling schedule can have a significant impact on reconstruction quality. Consequently a significant part of this research will involve developing metrics to assess the quality of different schedules and identifying optimum schedules for different experiments.

4) Developing GPU (graphics processing unit) approaches
Reconstruction times for CS processing of NMR data benefit significantly from parallelisation. GPUs are a currently under-exploited resource in this area. Adapting the existing code to use GPUs would provide further significant speed-ups in processing time, further expanding the range of experiments that can be studied.

References:
[1] E. J. Candes, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006.
[2] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[3] M. Lustig, D. L. Donoho, and J. M. Pauly, “Sparse MRI: The application of compressed sensing for rapid MR imaging,” Magn. Reson. Med., vol. 58, no. 6, pp. 1182–1195, Dec. 2007.
[4] D. J. Holland, M. J. Bostock, L. F. Gladden, and D. Nietlispach, “Fast multidimensional NMR spectroscopy using compressed sensing,” Angew. Chemie Int. Ed., vol. 50, no. 29, pp. 6548–6551, Jun. 2011.
[5] K. Kazimierczuk and V. Y. Orekhov, “Accelerated NMR spectroscopy by using compressed sensing,” Angew. Chemie Int. Ed., vol. 50, no. 24, pp. 5556–5559, Apr. 2011.
[6] D. J. Holland and L. F. Gladden, “Less is More: How Compressed Sensing is Transforming Metrology in Chemistry,” Angew. Chemie Int. Ed., vol. 53, pp. 13330–13340, 2014.
[7] S. G. Hyberts, H. Arthanari, and G. Wagner,

 
 
 
 
 
 
 
Skills Required: Good programming expertise, particularly with Python. Interest in information theory.
Skills Desired:  
Project Open to: Undergraduates, Part III (master's) students, PhD Students
Deadline to register interest: 3 March

[Return to List]

Modelling the light distribution in photoacoustic imaging

Contact Name: Joanna Brunker
Contact email: jb2014@cam.ac.uk
Lab/Department: Department of Physics and CRUK Cambridge Institute
Contact Address: Cancer Research UK Cambridge Institute
University of Cambridge
Li Ka Shing Centre
Robinson Way
Cambridge
CB2 0RE
Period of the Project: 31 July – 22 September
Brief Description of Project: Photoacoustic imaging is an emerging technique involving generation of ultrasound using pulses of laser light. Absorption of light by tissue chromophores such as haemoglobin in blood induces a small temperature rise leading to an increase in pressure, and consequently generation of ultrasound waves. Detection of the ultrasound at the tissue surface enables a map of tissue absorption to be reconstructed, since the amplitude of the ultrasound signal is proportional to the absorption. However, the photoacoustic imaging community faces a significant challenge in using the reconstructed images to accurately quantify the concentration of the absorbers, for example to calculate the concentration of oxyhaemoglobin in blood to find the blood oxygenation. The reason for this is that the absorption is proportional not only to the absorption coefficients of the tissue components, and their concentration, but also to the intensity of light incident on these tissue components (the light fluence). The light fluence distribution can be estimated experimentally, or by using models such as Monte Carlo or the diffusion equation, and then used to correct the photoacoustic images to more accurately quantify the absorber concentrations. We have already successfully implemented a correction using the diffusion equation in 2D, but still need to validate this against other models such as Monte Carlo, and to investigate corrections in 3D. The project will address these challenges using MATLAB modelling of both simulated and experimental data representing tissue absorption in a living mouse.
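
The fluence correction described above can be sketched in one dimension. The example assumes the standard relation that the initial pressure is the product of the Grüneisen parameter, the local absorption coefficient and the fluence, and it models the fluence as a simple exponential decay with the diffusion-approximation attenuation coefficient of a homogeneous background. The optical properties and depths are made-up values, and the project itself uses MATLAB rather than Python.

```python
import math


def effective_attenuation(mu_a, mu_s_reduced):
    """Diffusion-approximation attenuation coefficient (per mm)."""
    return math.sqrt(3.0 * mu_a * (mu_a + mu_s_reduced))


def fluence(depth_mm, surface_fluence, mu_eff):
    """Exponential decay of fluence with depth in a homogeneous background."""
    return surface_fluence * math.exp(-mu_eff * depth_mm)


def initial_pressure(mu_a_local, phi, grueneisen=1.0):
    """Photoacoustic source strength: Grueneisen * absorption * fluence."""
    return grueneisen * mu_a_local * phi


def fluence_correct(p0, phi, grueneisen=1.0):
    """Recover the absorption coefficient by dividing out the fluence."""
    return p0 / (grueneisen * phi)


# Two identical absorbers (hypothetical vessels) at different depths.
mu_eff = effective_attenuation(mu_a=0.01, mu_s_reduced=1.0)
depths = [2.0, 8.0]
phis = [fluence(z, 1.0, mu_eff) for z in depths]
raw = [initial_pressure(0.2, phi) for phi in phis]
corrected = [fluence_correct(p, phi) for p, phi in zip(raw, phis)]
print(raw, corrected)
```

The deeper absorber gives a weaker raw signal although its absorption is the same; dividing by the modelled fluence removes that depth bias.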
Skills Required: Proficiency with MATLAB
Skills Desired: Familiarity with image reconstruction and light models
Familiarity with Monte Carlo and related computational algorithms
Project Open to: Undergraduates, Part III (master's) students, PhD Students
Deadline to register interest: 3 March

[Return to List]

Spatiotemporal dynamics of plant growth hormone gibberellin (GA) and cellular growth in plant cells

Contact Name: Alexander Jones
Contact email: alexander.jones@slcu.cam.ac.uk
Lab/Department: Sainsbury Laboratory Cambridge University
Contact Address: Sainsbury Laboratory, Cambridge University
Bateman St.
Cambridge, CB2 1LR
Period of the Project: 8 weeks
Brief Description of Project: A major challenge in plant biology is to understand how multicellular organisms integrate dynamic developmental and environmental inputs to drive cellular responses. These cellular processes are well orchestrated across the spatial and temporal scales to enhance tissue- and organ-level functionality. Understanding the mechanisms that control the cellular processes, such as hormone levels and cellular growth, requires a multidisciplinary and systems-level approach.

The project is focused on using a combined approach of cell biology, molecular genetics, and mathematical modelling to visualize and quantify hormone levels that contribute to plant cellular growth. The plant hormone gibberellin (GA) is a powerful growth regulator that controls key developmental transitions such as germination, flowering and fruiting. The Jones group is using a novel FRET biosensor for GA (GPS1) to reveal GA patterns and dynamics in living plants. As a first step, GA will be visualized (using the GPS1 biosensor) in growing cells using confocal microscopy, and a spatiotemporal map of GA levels and cellular growth rates will be developed. To test hypotheses about the dynamic relationship between GA levels and cell growth, GA levels will be manipulated using mutant plants defective in GA biosynthesis and treatments with exogenous GA. The quantitative data generated from these experiments will be fed back into an ongoing 3D finite element model of growing cells, and the key predictions of the model will be tested through the experimental approaches. We anticipate an iterative combination of modelling and experimental work leading to a holistic understanding of how spatiotemporal patterns of GA growth hormone quantitatively and mechanistically relate to cellular growth.

The student will gain experience in: 1. plant culture methods; 2. cutting-edge confocal microscopy techniques (FRET) to collect 3D time-lapse images of GA levels and cellular growth in growing cells; 3. advanced 4D image processing; 4. developing metrics describing the quantitative relationship between GA levels and cell growth; 5. participating in the ongoing development of a 3D finite element computational model aimed at integrating GA hormone levels, physical properties of the cell wall, and cellular growth. The student will work closely with a postdoc (Ankit Walia) to collect and analyze data.

Skills Required: Basic knowledge of biology and programming in e.g. MATLAB/Python/C++ would be an advantage, but there are no strict prerequisites for this project.
Skills Desired: An interest in plant development and microscopy would be a great start!
Project Open to: Undergraduates, Part III (master's) students
Deadline to register interest: 3 March

[Return to List]

Stem cell packing

Contact Name: Lee Hazelwood
Contact email: lee.hazelwood@cruk.cam.ac.uk
Lab/Department: CRUK
Contact Address: Dr Lee Hazelwood
Cancer Research UK Cambridge Institute
University of Cambridge
Li Ka Shing Centre
Robinson Way
Cambridge CB2 0RE
Lee.Hazelwood@cruk.cam.ac.uk
Period of the Project: 8 weeks
Brief Description of Project: Intestinal crypts contain stem cells that maintain the intestine. These stem cells replicate and differentiate into Paneth cells, forming an organised, intercalated, chessboard-like pattern in the intestine; see Figure 1a in ref [1]. The aim of this project is to develop a Cellular Potts model in order to understand how intercellular interactions lead to the observed stem and Paneth cell organisation. We will do this by developing a Cellular Potts code, based on the original work by Graner and Glazier [2], for a simple 2D geometry.
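
A minimal sketch of a Cellular Potts update is given below to show the ingredients involved: a lattice of cell indices, a type-dependent contact energy, an area constraint, and Metropolis copy attempts. The grid size, the energies in `J`, the area penalty and the four-block starting layout are all hypothetical choices, and the energy is recomputed from scratch at each attempt for clarity rather than speed.

```python
import math
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def hamiltonian(grid, cell_type, J, lam, target_area):
    """Contact energy between unlike cells plus a quadratic area penalty."""
    n = len(grid)
    energy = 0
    area = {cell: 0 for cell in cell_type}
    for i in range(n):
        for j in range(n):
            cell = grid[i][j]
            area[cell] += 1
            # Count each neighbouring pair once (down and right, periodic).
            for di, dj in ((1, 0), (0, 1)):
                other = grid[(i + di) % n][(j + dj) % n]
                if other != cell:
                    energy += J[cell_type[cell]][cell_type[other]]
    for cell in cell_type:
        energy += lam * (area[cell] - target_area) ** 2
    return energy


def metropolis(grid, cell_type, J, lam, target_area, temperature, n_attempts, rng):
    """Attempt `n_attempts` index copies in place; return the final energy."""
    n = len(grid)
    energy = hamiltonian(grid, cell_type, J, lam, target_area)
    for _ in range(n_attempts):
        i, j = rng.randrange(n), rng.randrange(n)
        di, dj = rng.choice(NEIGHBOURS)
        source = grid[(i + di) % n][(j + dj) % n]
        old = grid[i][j]
        if source == old:
            continue
        grid[i][j] = source  # tentatively copy the neighbour's cell index
        new_energy = hamiltonian(grid, cell_type, J, lam, target_area)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
        else:
            grid[i][j] = old  # reject the copy
    return energy


def block_grid(n=8):
    """Four square cells with indices 0..3 tiling an n x n lattice."""
    half = n // 2
    return [[2 * (i // half) + (j // half) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    grid = block_grid()
    types = {0: 0, 1: 1, 2: 1, 3: 0}  # 0 = stem, 1 = Paneth (illustrative)
    J = [[4, 1], [1, 4]]  # unlike types adhere more strongly than like types
    print(metropolis(grid, types, J, 1, 16, 2.0, 500, random.Random(1)))
```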

This project is self contained.

[1] Sato et al. Nature 469, 415–418 (20 January 2011) doi:10.1038/nature09637
[2] Graner, F., Glazier, J.A.: Simulation of biological cell sorting using a two dimensional extended Potts model. Phys. Rev. Lett. 69, 2013–2016 (1992)

Skills Required: Coding in Matlab, C or R.
Previous experience in carrying out some simulations on Grids.
Skills Desired: Physical understanding of energy functions.
Project Open to: Part III (master's) students, PhD Students
Deadline to register interest: 3 March

[Return to List]

Single cell normalisation

Contact Name: Lee Hazelwood
Contact email: lee.hazelwood@cruk.cam.ac.uk
Lab/Department: CRUK
Contact Address: Dr Lee Hazelwood
Cancer Research UK Cambridge Institute
University of Cambridge
Li Ka Shing Centre
Robinson Way
Cambridge CB2 0RE
Lee.Hazelwood@cruk.cam.ac.uk
Period of the Project: 8 weeks
Brief Description of Project: We are beginning to acquire genetic measurements (expression levels of all genes) from the single biological cells that comprise tissues. Early data indicate that these genetic measures are heterogeneous across cells, potentially indicating different developmental programmes for these cells. However, one potential confounding factor to our confidence in these predictions is our ability to measure lowly expressed genes, which are often absent from the data. See for example [1].

This project will look at different methods to normalise this data and see how they are affected by missing data. Possible parts of the project, depending on one's background and interests, might include:
- simulating idealised datasets for normalisation (requires general coding)
- normalising real datasets using different methods (requires R experience)
- developing a novel normalisation method (more mathematical)
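
The first two items above can be sketched together: simulate an idealised count matrix with cell-specific size factors and random loss of low counts, then apply the simplest normalisation, scaling every cell to a common library size. The Poisson model, the dropout rule and all parameter values are illustrative assumptions, not the methods of ref [1].

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (fine for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(rng, gene_means, size_factors, dropout_prob):
    """Return counts[cell][gene]; low counts are zeroed with prob. dropout_prob."""
    counts = []
    for s in size_factors:
        row = []
        for mu in gene_means:
            c = poisson(rng, s * mu)
            if c <= 2 and rng.random() < dropout_prob:
                c = 0  # hypothetical dropout of a lowly expressed gene
            row.append(c)
        counts.append(row)
    return counts


def library_size_normalise(counts):
    """Scale each cell so that its total equals the mean total over cells."""
    totals = [sum(row) for row in counts]
    target = sum(totals) / len(totals)
    return [
        [c * target / t for c in row] if t > 0 else list(row)
        for row, t in zip(counts, totals)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    counts = simulate_counts(rng, [0.5, 5.0, 50.0], [0.5, 1.0, 2.0], 0.5)
    for row in library_size_normalise(counts):
        print([round(v, 2) for v in row])
```

Comparing the normalised values of the lowest-expressed gene with and without dropout is one simple way to see how missing data distorts a given method.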

The project is self contained.

[1] Lun AT, McCarthy DJ, Marioni JC. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 2016 Aug 31;5:2122.

Skills Required: Familiarity with coding in Matlab, C and R.
Simulation of data according to particular distributions.
Experience using convolutions.
Skills Desired: Interest in bioinformatics.
Project Open to: Part III (master's) students, PhD Students
Deadline to register interest: 3 March

[Return to List]

Application of Compressive Sensing to Hyperspectral Endoscopy

Contact Name: Dr. Sarah Bohndiek
Contact email: seb53@cam.ac.uk
Lab/Department: Physics
Contact Address:

Dr. Jonghee Yoon (daily supervisor)
jy385@cam.ac.uk
Department of Physics

Period of the Project: 8 - 10 weeks (negotiable before starting the project)
Brief Description of Project: Endoscopic surveillance is crucial for diagnosing cancer in the gastrointestinal tract. Current endoscopic methods measure structural information and some biochemical features of tissue, but this information does not provide sufficient contrast for early detection of the disease. To overcome this limitation, we are developing hyperspectral endoscopy. Hyperspectral imaging techniques measure the full spatial and spectral characteristics of tissue, which would enable early detection by combining contrast relating to structural, biochemical and molecular characteristics of the tissue. Both scanning (spatial & spectral) and snap-shot methods can be used to obtain hyperspectral images, but they suffer from long operation times or limited spatial / spectral resolution, respectively. In this project, we will exploit compressive sensing (CS) techniques to develop real-time hyperspectral endoscopy while retaining high spatial and spectral resolution. CS is an image compression method that can achieve high speed by reducing the number of samples required to reconstruct an image. Recently, CS has been applied to various biomedical techniques including fluorescence microscopy, photoacoustic microscopy, X-ray imaging, and hyperspectral imaging. However, there are many challenges in applying CS to hyperspectral endoscopy. The aim of this project is to develop a new theoretical CS framework for hyperspectral endoscopy. The candidate will carry out a literature search to identify opportunities for CS and then perform simulations to develop and validate the new algorithm. Time allowing, there will be an opportunity to apply the new algorithm to real hyperspectral endoscopic experiments ongoing in the laboratory. The developed CS method could help in transferring hyperspectral endoscopy to the clinical level.

References
[1] Studer V, Bobin J, Chahid M, Mousavi HS, Candes E, Dahan M. Compressive fluorescence microscopy for biological and hyperspectral imaging. P Natl Acad Sci USA 2012;109:E1679-E1687
[2] Zhao R, Wang Q, Shen Y, Li J. Multidimensional dictionary learning algorithm for compressive sensing-based hyperspectral imaging. Journal of Electronic Imaging 2016;25:063013-063013

Skills Required: Programming skills (Matlab), but they can be learned on the project
Skills Desired: Familiarity with compressed sensing theory and developing algorithms. Ability to communicate with non-mathematicians
Project Open to: Undergraduates
Deadline to register interest:  

[Return to List]

Tensor Networks on TensorFlow

Contact Name: Austen Lamacraft
Contact email: al200@cam.ac.uk
Lab/Department: Physics
Contact Address: TCM, Department of Physics, Cavendish Laboratory
Period of the Project: Summer 2017
Brief Description of Project: Understanding large quantum systems is central to many areas of science. The exponentially large Hilbert space of the problem is the main impediment to any general calculational method. A recent breakthrough in describing low dimensional systems (normally chains of interacting spins) uses tensor networks, in which the state of the system is represented using a network of matrices of much smaller dimension than that of the full Hilbert space.

The goal of this project is to implement a tensor network scheme using TensorFlow, Google's open source library for computations using data flow graphs. This is an efficient way to represent the tensor network algorithm, and allows performance to be maximized using GPUs.
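
To illustrate what "a network of matrices of much smaller dimension" means, the sketch below writes a GHZ state of N spins as a matrix product state with 2x2 matrices and evaluates amplitudes by multiplying them along the chain. It uses plain Python lists rather than TensorFlow, and the choice of state and the helper names are assumptions made for illustration.

```python
import itertools
import math

# Site tensors of a GHZ state: one 2x2 matrix per local spin value.
A = {
    0: [[1.0, 0.0], [0.0, 0.0]],
    1: [[0.0, 0.0], [0.0, 1.0]],
}


def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def amplitude(spins):
    """Contract the chain left to right for one spin configuration."""
    v = [1.0, 1.0]  # left boundary vector
    for s in spins:
        v = vec_mat(v, A[s])
    return sum(v) / math.sqrt(2.0)  # right boundary vector (1, 1), then normalise


def norm_squared(n_sites):
    """Brute-force check over all 2**n_sites configurations (small n only)."""
    return sum(amplitude(c) ** 2 for c in itertools.product((0, 1), repeat=n_sites))


if __name__ == "__main__":
    print(amplitude((0, 0, 0, 0)), amplitude((0, 1, 0, 0)), norm_squared(6))
```

Each amplitude costs a number of small matrix products that grows linearly with the chain length, whereas storing the state vector directly needs 2^N numbers.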

See https://arxiv.org/abs/1609.01552 for an introduction to the physics of the problem.

Skills Required:  
Skills Desired: Familiarity with Python.
Project Open to: Part III (master's) students, PhD Students
Deadline to register interest: 3 March 2017

[Return to List]

Modelling signal processing for stem cell control

Contact Name: Henrik Jönsson
Contact email: henrik.jonsson@slcu.cam.ac.uk
Lab/Department: Sainsbury Laboratory
Contact Address: Sainsbury Laboratory, Bateman Street, Cambridge CB2 1LR
Period of the Project: 8 weeks
Brief Description of Project: Continuous organ formation in plants is driven by a cluster of stem cells in the tip of their shoots. A regulatory network decides which cell should maintain its stem cell identity, and which cell should instead specialize and start forming a new organ. At the centre of this regulatory network lies the negative feedback between the protein CLAVATA3 (CLV3) (produced by the stem cells) and the stem cell promoting transcription factor WUSCHEL (WUS), produced in a separate region of cells.

The dynamics of negative feedback is well understood, but things get more complicated as the signalling from CLV3 to WUS goes through several receptors. The receptors all perform signal processing on their own and at least one of them exhibits adaptation, meaning that it effectively filters out the average CLV3 input and only responds to recent changes. How do the other receptors work, and how is the signalling from the three integrated into a single output?
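
Adaptation of this kind can be reproduced by a generic two-variable model in which the input S drives both a response R and a slower variable X that removes R: dR/dt = k1*S - k2*X*R and dX/dt = k3*S - k4*X. The sketch below integrates it with Euler steps. This is a textbook-style toy model with arbitrary rate constants, not a model of the actual CLV3 receptors.

```python
def simulate_adaptive_receptor(signal, dt=0.01, k1=1.0, k2=1.0, k3=1.0, k4=1.0):
    """Euler integration of dR/dt = k1*S - k2*X*R, dX/dt = k3*S - k4*X.

    `signal` is a list of input levels S, one per time step. Returns the
    response R at every step, starting from the steady state for signal[0].
    """
    x = k3 * signal[0] / k4
    r = k1 * k4 / (k2 * k3)  # steady-state response, independent of S
    trace = []
    for s in signal:
        dr = k1 * s - k2 * x * r
        dx = k3 * s - k4 * x
        r += dt * dr
        x += dt * dx
        trace.append(r)
    return trace


if __name__ == "__main__":
    # The input doubles after 5 time units and then stays high.
    signal = [1.0] * 500 + [2.0] * 3000
    trace = simulate_adaptive_receptor(signal)
    print(round(max(trace), 3), round(trace[-1], 6))
```

The response rises transiently when the input steps up and then returns to its original level, so only the change in input is reported downstream.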

There is published data on what happens when you mutate these receptors, both one by one and in pairs. There is also data for how WUS is affected by changes in CLV3 concentration. With this, and with computational modelling, we aim to infer the relationship between these receptors, and what signal processing they must perform.

This project has some clear goals and a well-defined place to start. But the project is easily extensible, and while we have ideas of where one could go next, you would be in charge of your own research. The project is also adaptable from a more analytical to a mainly computational approach.


 
Skills Required: Solving and analysing differential equations.
Skills Desired: Experience with programming/scripting will be useful.
Project Open to: Undergraduates, Part III (master's) students, PhD Students
Deadline to register interest: 3 March 2017

[Return to List]

Modeling the evolutionary dynamics leading to Acute Myeloid Leukemia

Contact Name: Jamie Blundell
Contact email: jrb75@stanford.edu
Lab/Department: Cambridge Cancer Center & Department of Oncology
Contact Address: Cambridge Cancer Center, Early Detection Program
Hutchison MRC Research Center
Cambridge Biomedical Campus
Hills Road
Cambridge CB2 0XZ
Period of the Project: June - Sept (flexible)
Brief Description of Project: Cancer is an evolutionary disease, but our quantitative understanding of its onset and progression remains basic. It is widely accepted that cancer develops by alterations in specific sets of genes. While a huge number of studies over the past decade have identified such alterations, much remains to be discovered about cancer dynamics, especially at a quantitative level. A deeper understanding of cancer’s evolutionary dynamics will lead to better treatments and provide vital information for early detection and intervention. However, addressing these questions relies on the ability to quantitatively understand the clonal dynamics in both healthy and transformed tissues.

The need for a quantitative understanding of cancer evolution is keenly highlighted by blood malignancies such as Acute Myeloid Leukemia (AML). At diagnosis, most AML patients harbour mutations in ~3-10 of a possible ~100 or so AML-associated genes. The currently accepted theory is that these mutations are acquired sequentially: clones with more mutations are fitter than those with fewer and selectively outcompete them. However, given the low mutation rates and lack of genomic instabilities in these cancers, exactly how it is possible to accumulate this many mutations in slowly cycling stem cell populations has been difficult to explain. Further complicating the issue, recent deep sequencing studies suggest that even in “healthy” people there is a large degree of clonal evolution: subclones harboring different sets of somatic mutations compete with one another over the course of a person's life.

The goal of this research project will be to model how mutations arise and expand in tissues in order to understand what genomic signatures might be indicative of early cancer. Specifically we will: (i) Develop a stochastic “null” model of mutation dynamics in healthy (non-cancerous) tissues, maintained by a hierarchy of cell types. (ii) Modify the above model to include “driver” mutations that disrupt the homeostatic balance of the tissue, to model early cancer onset.

Methods used will be a combination of branching processes, nonlinear dynamics and stochastic simulations.
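
As a small example of the branching-process approach, the sketch below follows a single mutant clone in discrete generations. Each cell either divides into two, with probability (1 + s)/2, or is lost, so s plays the role of a fitness advantage. For this particular model the chance that the clone escapes extinction is 2s/(1 + s). The model, the threshold of 100 cells used to call a clone established, and the parameter values are simplifying assumptions; the project's models with a hierarchy of cell types would be richer.

```python
import random


def establishment_probability(s):
    """Closed form for this model: chance a single mutant lineage never dies out."""
    return 2.0 * s / (1.0 + s) if s > 0 else 0.0


def simulate_clone(rng, s, max_generations=200, established_size=100):
    """Follow one mutant clone; return True if it reaches `established_size`."""
    p_divide = (1.0 + s) / 2.0
    size = 1
    for _ in range(max_generations):
        if size == 0:
            return False
        if size >= established_size:
            return True
        # Every cell independently divides into two or is lost.
        size = sum(2 for _ in range(size) if rng.random() < p_divide)
    return size >= established_size


def estimate_establishment(rng, s, n_runs):
    return sum(simulate_clone(rng, s) for _ in range(n_runs)) / n_runs


if __name__ == "__main__":
    rng = random.Random(42)
    print(estimate_establishment(rng, 0.1, 2000), establishment_probability(0.1))
```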

Skills Required: Some exposure to stochastic processes, probability, asymptotic methods desired. Some programming also desired (Python, Matlab or similar).
Skills Desired:  
Project Open to: Part III (master's) students, PhD Students
Deadline to register interest: 3rd March

[Return to List]

Detection of animal faecal parasites using image analysis with an ioLight portable microscope

 

Contact Name: Carola-Bibiane Schönlieb
Contact email: cbs31@cam.ac.uk
Lab/Department: DAMTP
Contact Address: DAMTP, Wilberforce Road, CB3 0WA
Period of the Project: 8 weeks
Brief Description of Project: The overuse of antibiotics is causing pathogens to become resistant to antibiotics, potentially leaving us without effective treatments for a wide range of conditions. One of the causes of antibiotic resistance is the regular and indiscriminate use of antibiotics in livestock - frequently farmers see a problem in one animal and immediately treat the whole herd or flock with antibiotics as a precaution, without establishing if antibiotics are necessary. The authorities are now starting to limit the use of antibiotics in livestock by imposing strict limits on the levels of antibiotics that the meat can contain. This forces farmers and vets to be much more selective with antibiotics and only use them when really needed. To do this, farmers need to be able to accurately diagnose disease on the farm, rather than sending samples off to a lab and waiting days for a diagnosis.
Intestinal parasites are one of the more common issues for which antibiotics are used in cattle and sheep. There are a large number of parasites and only a small number are harmful, so to determine if antibiotics are required the faeces need to be analysed and the number and type of parasites present measured. This is done using a microscope to look for parasite eggs in the faeces, then counting the different types of egg found. Normally this is done with a bench microscope in a lab, and an experienced technician or parasitologist to manually identify and count the eggs. The new ioLight portable microscope has sufficient resolution to enable this to be done in the field, thus reducing the time required for diagnosis so the farmer immediately knows if treatment with antibiotics is appropriate or not. It is impractical to train farmers to manually count and identify eggs, so an automated image analysis solution is required to analyse the images from the microscope and count eggs. The image analysis software needs to work in the field, so it needs to operate on the relatively low powered Raspberry Pi within the microscope, but still deliver results quickly.

This project is to develop image analysis software running on a Raspberry Pi to analyse microscope images of faecal samples, identify the eggs present, then measure dimensions of the eggs and other simple parameters that could then be used to categorise the eggs. Efficient image analysis algorithms are required to ensure that the results are delivered to the user without significant delay.
To enable this project, ioLight will provide a microscope (containing the Raspberry Pi) and give assistance in learning how to use the microscope and how to access the images on the Raspberry Pi. MATLAB, running on the Raspberry Pi, could be used to develop the image analysis software.
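
As a rough illustration of the "identify, then measure" step described above, the following is a minimal sketch in Python rather than a proposed implementation. It assumes, purely for illustration, that eggs appear as dark regions on a lighter background in a greyscale image, that a fixed threshold separates them, and that 4-connected regions correspond to individual objects. The function name measure_blobs and the sample image are hypothetical.

```python
from collections import deque


def measure_blobs(image, threshold):
    """Find 4-connected regions of pixels darker than `threshold`.

    `image` is a list of rows of greyscale values. Returns one dict per
    region with its pixel area and bounding-box height and width.
    Assumption: objects of interest are darker than the background.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] >= threshold:
                continue
            # Breadth-first flood fill from this unvisited dark pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            rmin = rmax = r
            cmin = cmax = c
            while queue:
                y, x = queue.popleft()
                area += 1
                rmin, rmax = min(rmin, y), max(rmax, y)
                cmin, cmax = min(cmin, x), max(cmax, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append({
                "area": area,
                "height": rmax - rmin + 1,
                "width": cmax - cmin + 1,
            })
    return blobs


# Made-up 5x6 image: a 2x2 dark square and a 3x1 dark bar.
sample = [
    [200, 200, 200, 200, 200, 200],
    [200,  50,  50, 200, 200, 200],
    [200,  50,  50, 200,  40, 200],
    [200, 200, 200, 200,  40, 200],
    [200, 200, 200, 200,  40, 200],
]
blobs = measure_blobs(sample, threshold=128)
```

Area and bounding-box size are among the simplest parameters that could feed a later categorisation step. Real microscope images would likely need noise handling and a less naive threshold than this sketch uses.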

This project will be jointly supervised by
Joana Grah jg704@cam.ac.uk
Jasmina Lazic Jasmina.Lazic@mathworks.co.uk
Stefanie Reichelt Stefanie.Reichelt@cruk.cam.ac.uk
Carola-Bibiane Schönlieb cbs31@cam.ac.uk
Richard Williams richard.williams@iolight.co.uk

Skills Required: Maths 1a and 1b. Versatile in MATLAB programming. A curious mind and a passion for problem solving.
Skills Desired: Numerical analysis (numerical linear algebra and numerics for differential equations), harmonic analysis, partial differential equations.
Experience in programming on the Raspberry Pi.
Project Open to: Undergraduates, Part III (master's) students
Deadline to register interest: 3 March 2017

[Return to List]

Machine Learning for Internet Traffic Classification

 

Contact Name: Dr Andrew W Moore
Contact email: andrew.moore@cl.cam.ac.uk
Lab/Department: Computer Laboratory
Contact Address: Computer Laboratory,
William Gates Building,
University of Cambridge,
15 JJ Thomson Avenue
Period of the Project: 8 weeks (negotiable)
Brief Description of Project: In an original piece of research in 2005, Denis Zuev (Part III) and Andrew W. Moore of the Computer Laboratory showed how Bayesian machine-learning methods, when combined with high-quality ground-truth data, could usefully identify Internet applications from their traffic. In the intervening years there has been a revolution in Internet applications, in machine-learning methodologies, and in the applications to which such information may be put.

This proposal is to evaluate both the original data sets and new data sets using a number of modern methodologies for training and testing Internet traffic classifiers. This would permit us to evaluate such methods in new environments, e.g. the data centre, and to consider the actual value of new methodologies when compared with long-standing mechanisms for constructing Bayesian priors.

Practically, this work will start by identifying suitable ML algorithm candidate(s) and making baseline comparisons with the datasets of the 2005/6/7 work. This will enable a direct comparison of the value of these new algorithms and, in particular, an exploration of the opportunity for continuous refinement of the prior.
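
To make the idea of a Bayesian baseline concrete, here is a minimal sketch of a Gaussian naive Bayes classifier in plain Python. It is not the method of the cited papers. The two features (mean packet size and flow duration), the class labels, and all the numbers are made up for illustration, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate a class prior and per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    total = len(samples)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / total, means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


# Made-up flows: (mean packet size in bytes, duration in seconds).
flows = [(1400, 30.0), (1350, 45.0), (1450, 25.0),
         (90, 2.0), (120, 1.5), (100, 3.0)]
apps = ["bulk", "bulk", "bulk", "interactive", "interactive", "interactive"]

model = fit_gaussian_nb(flows, apps)
guess = predict(model, (1300, 40.0))
```

In this sketch the class prior is simply the class frequency in the training data. Refining the prior continuously, as the proposal suggests, would amount to updating that first element of each model entry as new labelled flows arrive.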

References:
[1] Internet Traffic Classification Using Bayesian Analysis Techniques
Moore AW, Zuev D
ACM SIGMETRICS 2005
http://www.cl.cam.ac.uk/~awm22/publications/moore2005internet.pdf

[2] Bayesian Neural Networks for Internet Traffic Classification
Auld T, Moore AW, Gull SF
IEEE Trans on NN
http://www.cl.cam.ac.uk/~awm22/publications/auld2006bayesian.pdf

[3] Probabilistic Graphical Models for Semi-Supervised Traffic Classification
Rotsos C, VanGael J, Moore AW, Ghahramani Z.
http://www.cl.cam.ac.uk/~awm22/publications/rotsos2010probabilistic.pdf


 
Skills Required: Statistics background
Skills Desired: Machine learning background
Project Open to: Part III (master's) students, PhD Students
Deadline to register interest:  
For Industrial Hosts:  

[Return to List]

Modelling Tandem Queues in Data Centres

Contact Name: Andrew W Moore
Contact email: andrew.moore@cl.cam.ac.uk
Lab/Department: Computer Laboratory
Company: University of Cambridge
Contact Address: Computer Laboratory,
William Gates Building,
University of Cambridge,
15 JJ Thomson Avenue
Period of the Project: 8 weeks (negotiable)
Brief Description of Project: The modelling of networks of queues has been an extraordinarily successful technique for understanding traffic behaviour in (computer packet) networks. Queues within computer networks are often modelled as independent and isolatable. The practical reality is that computer networks are sets of tandem queues, each with different arrival and processing distributions.

Modelling to date has relied on the independence of queues for (simplistic) modelling; yet integrating the tandem nature is important if the models are to encapsulate properties of modern data-centre networks. Additionally, treating a collection of (tandem) queues as a reduced number of queues enables practical simulation and reproduction of queueing impact within large-scale queue structures, such as DC networks, using current simulation tools.

As data centres grow in scale and complexity, common simulation tools can no longer support the classical queueing modelling used in legacy models. This proposal is to consider collapsing multiple-queue models into a single complex queue, representing a single end-host view of the queueing of packets within the network and the delay that queueing introduces for each packet.

Beyond the theoretical modelling, this work will aim to validate the conclusions using real packet-trace measurements. This will enable a direct comparison of the theoretical modelling of queueing in the network, using both legacy simulators and the new models, with the actual packet behaviour.
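
For readers unfamiliar with tandem queues, the sketch below shows the end-to-end view the proposal refers to. It assumes, for illustration only, single-server FIFO stages with unlimited buffers, where each packet's departure from one stage is its arrival at the next. The function name tandem_departures and the numbers are hypothetical.

```python
def tandem_departures(arrivals, service_times):
    """Departure times from the last stage of a FIFO tandem queue.

    `arrivals[i]` is the arrival time of packet i at the first stage.
    `service_times[k][i]` is the service time of packet i at stage k.
    Assumes one server per stage, FIFO order and unlimited buffers.
    """
    times = list(arrivals)
    for stage in service_times:
        previous_departure = float("-inf")
        departures = []
        for arrive, service in zip(times, stage):
            # Service starts once the packet has arrived and the server is free.
            start = max(arrive, previous_departure)
            previous_departure = start + service
            departures.append(previous_departure)
        # Departures from this stage are the arrivals at the next one.
        times = departures
    return times


# Three packets through two stages (made-up numbers).
arrivals = [0, 1, 2]
services = [[2, 2, 2],   # stage 1
            [1, 1, 1]]   # stage 2
departures = tandem_departures(arrivals, services)
delays = [d - a for d, a in zip(departures, arrivals)]
```

The list delays is the per-packet end-to-end delay seen by an end host. A collapsed single-queue model of the kind the proposal describes would aim to reproduce that delay distribution without simulating each stage.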

Skills Required: A strong statistics background, with a good knowledge of queueing theory.
Skills Desired:  
Project Open to: Part III (master's) students, PhD Students
Deadline to register interest:  
For Industrial Hosts:  

[Return to List]

Counting motifs in bipartite graphs

Contact Name: Benno Simmons
Contact email: benno.simmons@gmail.com
Lab/Department: Conservation Science Group, Department of Zoology
Contact Address: Department of Zoology, University of Cambridge, The David Attenborough Building, Pembroke Street, Cambridge, CB2 3QZ, UK
Period of the Project: 8 weeks between late June and September, but can be flexible
Brief Description of Project: In 2002, the concept of network motifs was introduced for unipartite graphs. Motifs are subgraphs defined by a particular pattern of interactions between vertices. Particular motifs may reflect particular functions in real-world networks, and have been taken up widely in fields such as biology, for the analysis of food webs and gene regulatory networks. However, the approach has rarely been applied to bipartite graphs, and no widely accessible methods or software packages yet exist for doing so.

Bipartite graphs are widely used across a range of disciplines. In economics, trade networks can depict edges between countries and the products they export. In the social sciences, affiliation networks, such as those depicting edges between users and the movies they have watched, are common. In ecology, bipartite graphs have been very widely adopted. For example, communities of plants and their pollinators can be represented as bipartite graphs, with plants and pollinators as two sets of vertices, and edges representing their interactions. Other types of ecological interaction can be represented the same way, such as those between plants and the birds that disperse their seeds, or between hosts and their parasites.

Alongside my collaborators, I have been working to extend the concept of motifs to undirected bipartite graphs, and develop code (in R, MATLAB and Python) so that these methods can be widely adopted.

We have already developed methods to count the frequency of all possible undirected bipartite motifs containing 2, 3, 4 and 5 vertices in any bipartite graph. The next stage is to develop methods for counting the number of times each vertex occurs in each topologically unique position within each motif. Once this is completed, we want to extend our methods to counting motifs containing 6 vertices.

The student would likely focus on (i) developing methods to count the frequency of motifs containing 6 vertices in bipartite graphs and (ii) developing methods to count the number of times each vertex occurs in each topologically unique position within each of these 6-vertex motifs. Ideally, these methods would then be written into code (preferably in R or MATLAB).
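
As a small illustration of what motif counting means for the simplest cases, the sketch below counts the 2-vertex and 3-vertex motifs of an undirected bipartite graph from its biadjacency matrix. It is not the collaborators' existing code, and the function name and motif labels are hypothetical. Python is used here only for brevity.

```python
from math import comb


def count_small_motifs(biadjacency):
    """Count 2- and 3-vertex motifs in an undirected bipartite graph.

    `biadjacency[i][j]` is 1 if row vertex i (e.g. a plant) is linked to
    column vertex j (e.g. a pollinator), else 0.
    """
    row_degrees = [sum(row) for row in biadjacency]
    col_degrees = [sum(col) for col in zip(*biadjacency)]
    return {
        # 2 vertices: a single edge.
        "edge": sum(row_degrees),
        # 3 vertices: one row vertex linked to two column vertices.
        "row_centred_path": sum(comb(d, 2) for d in row_degrees),
        # 3 vertices: one column vertex linked to two row vertices.
        "col_centred_path": sum(comb(d, 2) for d in col_degrees),
    }


# Made-up network with 3 row vertices and 3 column vertices.
network = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
counts = count_small_motifs(network)
```

For these sizes the degree-based formula is exact, because the two outer vertices of a 3-vertex path lie in the same vertex set and so can never be joined by an edge. Larger motifs need more care, since extra edges can turn one motif into another.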

The student would then be free to pursue questions of interest. As an ecologist, I’m particularly interested in ecological networks (especially those characterising mutualistic interactions between species, such as plants and pollinators). I’m especially interested in recurrent and statistically significant motifs, and what they can tell us about (i) the evolutionary processes involved in network assembly, and (ii) the robustness of graphs to simulated perturbations. Examining robustness can involve simple topological models, or more complex approaches, such as modelling population dynamics on networks using generalised Lotka-Volterra equations. Motifs can also help answer questions such as whether the ways in which species are embedded in networks are a fundamental property of the species involved, or a function of context. Such approaches could help advance our understanding of ecological communities, and eventually inform conservation to ensure their protection in the future.

If the student is keen to focus on other types of bipartite graph, this is also possible: the project could follow the interests of the student once the core methods have been developed.

Skills Required: Graph theory, coding (R or MATLAB)
Skills Desired:  
Project Open to: Undergraduates, Part III (master's) students, PhD Students
Deadline to register interest: 3 March
For Industrial Hosts:  

[Return to List]

Music and Mathematics: Formal Methods in Style Detection

Contact Name: Prof Pablo Padilla
Contact email: pp432@cam.ac.uk
Lab/Department: Fitzwilliam College
Contact Address: Prof Pablo Padilla (Visiting Fellow in Mathematics, Fitzwilliam College)

Francis Knights (Fellow and DoS in Music, Fitzwilliam College)

Period of the Project: 4-8 weeks, variable
Brief Description of Project: The project (http://formal-methods-in-musicology.webnode.com/) focuses on developing mathematical methods to classify and establish authorship of musical material, here specifically two high-quality groups of 17th-century French keyboard music by members of the famous Couperin dynasty. From a practical perspective the aim is to develop computational tools that can classify musical works depending on their stylistic similarity. Additionally, the results should be of interest in algorithmic composition and generative music as well as in a more theoretical approach to the evolution of style and musical analysis.

Currently, we are exploring three approaches and we are looking for interested students to become involved with work on each of these:

1. Probabilistic and statistical models
Probabilistic and statistical tools such as principal component analysis, clustering analysis and stochastic processes (Markov chains and generalizations) are being applied to characterize pieces of music from a stylistic perspective.

2. Graph theoretical methods
Studying stylistic similarities leads in a natural way to the study of connectivity and other topological properties such as modularity, community structure, and motif detection in graphs associated to the studied musical works.

3. Hierarchical signal processing and wavelets
Signal processing tools can be applied to study similarities of the information content of musical fragments. New insights are needed to understand the different levels of musical structure.
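
As one concrete example of the first approach listed above, a first-order Markov chain can be estimated from a sequence of notes by counting how often each note follows each other note. This is a minimal sketch under that assumption, not the project's actual method. The function name and the melody are made up.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate first-order Markov transition probabilities from a sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {
            nxt: n / sum(followers.values())
            for nxt, n in followers.items()
        }
        for state, followers in counts.items()
    }


# Made-up melody, written as pitch names.
melody = ["C", "D", "E", "C", "D", "C"]
chain = transition_probabilities(melody)
```

Two pieces could then be compared by some distance between their transition tables, which is one simple way such a model might feed a stylistic classification.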

Skills Required: Some of the following, depending on the project:

Probability, Statistics, clustering
Graph theory
Hierarchical signal processing and wavelets

Skills Desired: Contributions from graduate students with interests in any of these particular areas are welcomed. They will engage with the theoretical as well as the computational aspects of the project; some musical knowledge would be a benefit.
Project Open to: Undergraduates, Part III (master's) students, PhD Students
Deadline to register interest: 10 April

[Return to List]

Image analysis of ice cream microstructure

Contact Name: Dr Carola Schönlieb (DAMTP), Rob Tovey (DAMTP), Peter Schuetz (Unilever)
Contact email: C.B.Schoenlieb@damtp.cam.ac.uk
Lab/Department: DAMTP
Contact Address: Centre for Mathematical Sciences, Wilberforce Road, Cambridge, CB3 0WA
Period of the Project: 8 weeks during the coming summer break.
Brief Description of Project: Ice cream is a complex multiphase material that has a microstructure that consists of ice crystals and air bubbles bound together by a “matrix phase”, which is an aqueous solution containing sugars, proteins and emulsified fat. Maintaining this structure is an essential part of delivering a high-quality ice cream to the consumer.
The best way to analyse the ice cream microstructure is to image this frozen structure using cryo-SEM (scanning electron microscopy). An example of the obtained images is shown here. At present these images are mainly compared qualitatively, because the segmentation needed to extract quantitative information about the size distributions of ice crystals and air bubbles is done manually and is therefore extremely time-consuming if reliable statistical data is required. Previous attempts to automate the segmentation of our typical images using standard software and processes were unsuccessful, owing to the lack of differentiation between air and ice structures and to problems in robustly recognising the outline of ice crystals.
With this project, we want to explore new approaches to allow a robust segmentation that could then be used as a routine method to extract quantitative information from the SEM images of the ice cream microstructures.
Skills Required: Experience in MATLAB or Python programming
Skills Desired: Numerical analysis, functional analysis, discrete geometry, convex optimisation
Project Open to: Current Part IB, II and Part III students in Mathematics Tripos
Deadline to register interest: 15 June 2017

[Return to List]

Application of mathematics in real world clinical management

Contact Name: Dr Rameen Shakur
Contact email: rameen@camheartwear.com
Lab/Department: Cambridge Heartwear
Contact Address: Cambridge Heartwear
23, Cambridge Science Park
Milton Road, Cambridge
CB4 0FN

Telephone: 01223 437144

Period of the Project: 8 weeks
Brief Description of Project: We would like to work with an enthusiastic student who is able to organise and analyse a number of clinically rich real-world data sets on the management of a few active chronic diseases. This will involve in-depth analysis of a number of clinical parameters and assessing the potential for modelling prospective physiological changes.
Skills Required: Computer literate
Able to organise multiple computer files of clinical data
Self motivated
Free thinker
Skills Desired: Punctual
Interested in health care and patient benefit
Project Open to: Undergraduates, Part III (master's) students, PhD Students
Deadline to register interest: 30 June

[Return to List]