Below you will find the list of CMP industrial projects hosted by external companies. Click here to see the list of academic projects hosted by other departments and labs within the university.
New projects may be added, so do check back regularly!
You can enquire about the projects you are interested in by writing to the contact given in the project listing. Unless alternative instructions are given in the project listing, to apply for a project you should send your CV to the contact along with a covering email which says why you are interested in the project and why you think you would be a good fit. Please note, some projects have an earlier application deadline than the general CMP deadline of Friday 23 February 2024.
For tips on making strong placement applications see the advice here from CMP Co-Founder James Bridgwater. It’s better to put the work into making fewer but stronger applications than firing off a very generic application to all projects – you won’t stand out with the latter approach!
At the CMP Lunchtime Seminar Series in February 2024 many hosts gave short presentations about their projects. Where possible these were recorded and links to the videos posted on the seminar webpage.
Please note that to participate in the CMP programme you must be a student in Part IB, Part II, or Part III of the Mathematical Tripos at Cambridge.
Industrial CMP Project Proposals for Summer 2024
- Novo Nordisk, Digital Science and Innovation, Machine Intelligence - Providing genomic and molecular network context for in-silico predicted drug-target candidates to aid hypothesizing their mode of action
Keywords: Genome-wide association study, drug discovery, mode of action, biomedical knowledge graph, graph link prediction
- Riverlane - Internship in Quantum Computing
Keywords: Quantum, Computing, Algorithms, Software, Programming
- Silvaco Europe, TCAD. - Finite difference approximation of the Stokes flow with free interfaces on staggered Cartesian grids
Keywords: Stokes flow, free interface, finite difference methods, numerical linear algebra.
- ORCA Computing - Solving graph problems with a photonic quantum processor
Keywords: quantum computing, graph problems
- Barclays - Risk management in FX electronic trading
Keywords: Finance, Algo-trading, Execution, Market Making, Portfolio Management
- Lyzeum Ltd. - Developing an explainable segmentation tool for duodenal biopsy diagnosis from whole slide images, to maximise likelihood of clinical adoption
Keywords: Machine Learning, Image Segmentation, Medical Imaging, Explainable AI, Digital Pathology
- APEX Horticulture - Evaluating the impact of alternative freight methods on the transportation of cut roses from the Equator to Europe
Keywords: Horticulture
- Silvaco Europe, TCAD process simulation - Study of ion collision models during implantation.
Keywords: Mathematical modelling, monte carlo simulation, mathematical physics, semiconductors
- Novo Nordisk, Digital Science and Innovation, Human Genetics CoE - Diffusion Models for Context-Aware Splice Site Prediction
Keywords: Diffusion Modelling, Splice-Site Prediction, Genetics, Drug Target Discovery, Machine Learning
- Silvaco Europe, TCAD - Mathematical models of thermal oxidation of silicon carbide: verification and calibration
Keywords: Mathematical modelling, simulation, verification, calibration, oxidation
- Siemens Power Electronics Innovation Hub - Grid impedance estimation by Machine learning / AI methods
Keywords: Machine Learning, grid impedance estimation, AI, Power Electronics, Matlab / Simulink
- Siemens Power Electronics Innovation Hub - Reduction and efficient computation of Neural Networks
Keywords: Neural Network, Algorithms, AI, Optimisation
- Aspect Capital - Pricing and hedging short term interest rate options
Keywords: Interest Rates, Options, Pricing, Implied Volatility Fitting, Fixed Income, Black Scholes, Normal vs LogNormal distributions, Exchange Trading, Over-The-Counter Trading, Relative Value
- Pharo Management - Changepoint Detection in Financial Time Series
Keywords: Changepoint detection, financial Mathematics, Time series analysis
Providing genomic and molecular network context for in-silico predicted drug-target candidates to aid hypothesizing their mode of action
Project Title | Providing genomic and molecular network context for in-silico predicted drug-target candidates to aid hypothesizing their mode of action |
Keywords | Genome-wide association study, drug discovery, mode of action, biomedical knowledge graph, graph link prediction |
Project Listed | 5 January 2024 |
Project Status | Filled |
Contact Name | Marie Lisandra Zepeda Mendoza |
Contact Email | vmnz@novonordisk.com |
Company/Lab/Department | Novo Nordisk, Digital Science and Innovation, Machine Intelligence |
Address | 1 Pancras Square, London, N1C 4AG |
Project Duration | 8 weeks |
Project Open to | Master's (Part III) students |
Background Information |
Novo Nordisk Research Centre Oxford (NNRCO) is an innovative target discovery and translational research unit with a focus on identifying novel therapies for patients with cardiometabolic diseases (e.g. diabetes mellitus, obesity). To identify novel drug targets, we employ a variety of advanced computational biology techniques (graph data science and machine learning) on a myriad of data sources (genomics, transcriptomics, etc.). In particular, the Machine Intelligence group has developed graph-based machine learning models to perform link prediction on knowledge graphs (KGs) and has explored the positive impact that enriching a KG with task-specific information has on exploring the molecular landscape of complex pathologies. Genetic information provides direct evidence of the relevance of a drug-target candidate to a disease; however, the identifications come without a clear molecular context. One of the main aspects to consider when hypothesizing the mode of action (MoA) of a drug-target candidate is its tissue of action (ToA). Gene candidates identified through genome-wide association studies (GWAS) can be contextualized into the biology of a tissue by using various bioinformatic functional annotations, gene expression and epigenomic data. |
Project Description |
There is an analytical framework to predict scores for the ToA from a GWAS meta-analysis of type 2 diabetes (T2D) [1] that the student will implement in-house. There is also a published knowledge graph called GenomicKB [2], with all publicly available genomic information, including GWAS associations, epigenome, transcriptome and other bioinformatic annotations. The student will enrich the in-house version of GenomicKB with the findings of the ToA scoring software and compare the KG link prediction results of the original versus the enriched KG to identify genes associated with T2D. The student will spend 8 weeks exploring: 1) the value of predicting ToA for genetic findings in helping hypothesize the MoA of potentially relevant in-house candidates, and 2) the impact that enriching a genomic KG with task-specific information has on link prediction algorithms used to identify genes related to T2D. To this end, the student will first implement in-house the software that scored the ToA of genetic hits for T2D, and will then work with KG modelling approaches to evaluate link prediction algorithms on the original and the enriched KG. |
Work Environment |
The student will have the possibility to come to the Novo Nordisk office in London, where they can interact with the other three members of the Machine Intelligence Dept. who are based there, in particular with Marc Boubnouvski, who will be the main London contact. M. Lisandra Zepeda Mendoza is based in Oxford, where the student can also visit if they wish, and Lisandra will also visit the London office on an ad hoc basis for more in-depth discussions with the student. For the duration of the internship, the student will also attend the Machine Intelligence Department meetings, which are held online, as the other team members are based in Seattle and Copenhagen. The student can also choose to work from home if they wish. However, in-person communication at the London office would be preferred. |
References | [1] Torres, Jason M., et al. "A multi-omic integrative scheme characterizes tissues of action at loci associated with type 2 diabetes." The American Journal of Human Genetics 107.6 (2020): 1011-1028. [2] Fan Feng, Feitong Tang, Yijia Gao, Dongyu Zhu, Tianjun Li, Shuyuan Yang, Yuan Yao, Yuanhao Huang, Jie Liu, GenomicKB: a knowledge graph for the human genome, Nucleic Acids Research, Volume 51, Issue D1, 6 January 2023, Pages D950-D956 |
Prerequisite Skills | Statistics, Mathematical Analysis, Geometry/Topology, coding skills (python) |
Other Skills Used in the Project | Predictive Modelling, Database Queries |
Acceptable Programming Languages | Python, R |
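To make the idea of comparing link prediction on an original versus an enriched graph concrete, here is a minimal sketch. It uses a classical common-neighbour score purely for brevity; it is not the graph-based machine learning approach of the Machine Intelligence group, and every node name and edge below is hypothetical rather than taken from GenomicKB or from [1].

```python
def build_adjacency(edge_list):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbour_score(adj, u, v):
    """Number of neighbours shared by u and v (a simple link prediction heuristic)."""
    return len(adj.get(u, set()) & adj.get(v, set()))


# Hypothetical toy graph: gene and tissue names are made up for illustration.
original_edges = [
    ("GENE_A", "T2D"),
    ("GENE_A", "pancreas"),
    ("GENE_B", "pancreas"),
    ("GENE_C", "liver"),
]
# Hypothetical "tissue of action" edge added by an enrichment step.
enrichment_edges = [("T2D", "pancreas")]

original = build_adjacency(original_edges)
enriched = build_adjacency(original_edges + enrichment_edges)

for gene in ("GENE_B", "GENE_C"):
    before = common_neighbour_score(original, gene, "T2D")
    after = common_neighbour_score(enriched, gene, "T2D")
    print(gene, "score before enrichment:", before, "after:", after)
```

In this toy setting the added tissue edge raises the score of the gene that shares the tissue with the disease and leaves the unrelated gene unchanged, which is the kind of before/after comparison the project description refers to.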
Internship in Quantum Computing
Project Title | Internship in Quantum Computing |
Keywords | Quantum, Computing, Algorithms, Software, Programming |
Project Listed | 5 January 2024 |
Project Status | Filled. Application deadline: 21 January 2024 |
Contact Name | Emily Wild |
Contact Email | emily.wild@riverlane.com |
Company/Lab/Department | Riverlane |
Address | St Andrew's House, 59 St Andrew's Street, Cambridge, CB2 3BZ |
Project Duration | 10 - 12 weeks, full-time |
Project Open to | Master's (Part III) students |
Background Information | Riverlane’s mission is to make quantum computing useful, sooner. From climate change to healthcare, large and reliable quantum computers will help solve some of the world’s biggest challenges. Riverlane is building the quantum error correction layer to make this happen sooner. It’s a complex problem that requires a range of skills, talent and passion. We’re making remarkable progress and growing fast. |
Project Description |
Our full-time summer internships are designed to enable current students in a technical field to translate their skills and expertise into an industrial setting. You will join us at our Head Office in Cambridge, UK, for 10 to 12 weeks, where you will have the opportunity to work alongside our team of talented software and hardware engineers, mathematicians, quantum information theorists, computational chemists and physicists – all experts in their fields. Every intern will have a dedicated supervisor and will work on a project designed to make the best use of their background and skills whilst developing their knowledge of quantum computing. We will support all interns to try and produce a concrete output by the end of the internship such as a paper, product, or software tool.
What you will do:
Requirements
For more information, please visit https://www.riverlane.com/jobs/internships
To apply please visit https://apply.workable.com/riverlane/j/A4807E07E1/ or email your CV and covering letter to jobs@riverlane.com by 21 January 2024. |
Work Environment |
Everyone is welcome at Riverlane. We are an equal opportunities employer and encourage applications from eligible and suitably qualified candidates regardless of age, disability, ethnicity, gender, gender reassignment, religion or belief, sexual orientation, marital or civil partnership status, or pregnancy and maternity/paternity. Studies have shown that women tend to apply to jobs if they meet all or almost all of the requirements whereas men apply even if they meet only some of the requirements. If that sounds like you then please apply – we are happy to review your application and let you know if we think you might be a good fit. If you need any adjustments made to the application or selection process so you can do your best, please let us know. We will be happy to help. Important notes: |
References | https://www.riverlane.com/research |
Prerequisite Skills | Proven ability in computational and/or theoretical work (please see 'Project Description' for full details of requirements) |
Other Skills Used in the Project | |
Acceptable Programming Languages | No Preference |
Finite difference approximation of the Stokes flow with free interfaces on staggered Cartesian grids.
Project Title | Finite difference approximation of the Stokes flow with free interfaces on staggered Cartesian grids. |
Keywords | Stokes flow, free interface, finite difference methods, numerical linear algebra. |
Project Listed | 5 January 2024 |
Project Status | Filled |
Contact Name | Vasily Suvorov |
Contact Email | vasily.suvorov@silvaco.com |
Company/Lab/Department | Silvaco Europe, TCAD. |
Address | Silvaco Technology Centre Compass Point St Ives, Cambridgeshire, United Kingdom PE27 5JL |
Project Duration | 8 weeks, 40 hours/week |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information | Modern semiconductor technology involves processes in which materials with free interfaces undergo large, slow deformations. Such deformations can often be modelled as incompressible Stokes flow. The project aims to analyse the company’s working numerical approach to modelling such flow, with the goal of improving accuracy, stability and convergence. |
Project Description | Silvaco uses finite difference schemes on structured 2D and 3D Cartesian grids to simulate Stokes flow with free interfaces. A particular difficulty in applying such schemes is the approximation of the boundary conditions at the free interfaces and of the momentum equations near the interfaces. At such irregular points the finite difference stencils do not form regular, orthogonal patterns but have irregular shapes, and special approximations are applied to the corresponding boundary conditions. Although this approach works in practice, it requires further mathematical analysis to improve the accuracy and stability of the numerical schemes. The student will help to better understand the approximation and stability properties of the current numerical schemes and possibly suggest better ones. Special attention will be given to the approximation of the pressure jump conditions across the interface using the approach suggested in [1]. |
Work Environment | The student will work independently, with support and guidance from the supervisor. |
References | [1] K. Ito, Z. Li and X. Wan, “Pressure jump conditions for Stokes equations with discontinuous viscosity in 2D and 3D”, Methods and Applications of Analysis, Vol. 13, No. 2, pp. 199-214, June 2006. |
Prerequisite Skills | Numerical Analysis, PDEs, Mathematical Analysis, Algebra/Number Theory |
Other Skills Used in the Project | |
Acceptable Programming Languages | Python, MATLAB |
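As background on why staggered Cartesian grids suit incompressible flow, the sketch below stores the velocity components on cell faces and checks the discrete divergence in each cell. It is only an illustration under stated assumptions: the stream function, grid size and all function names are hypothetical, there is no free interface or pressure jump, and it is not Silvaco's scheme.

```python
import math


def stream_function(x, y):
    # Hypothetical smooth test field; any function of (x, y) would do.
    return math.sin(math.pi * x) * math.sin(math.pi * y)


def staggered_velocities(n, h):
    """Face velocities derived from a stream function sampled at cell corners."""
    psi = [[stream_function(i * h, j * h) for j in range(n + 1)] for i in range(n + 1)]
    # u = d(psi)/dy lives on vertical faces, at (i*h, (j + 1/2)*h).
    u = [[(psi[i][j + 1] - psi[i][j]) / h for j in range(n)] for i in range(n + 1)]
    # v = -d(psi)/dx lives on horizontal faces, at ((i + 1/2)*h, j*h).
    v = [[-(psi[i + 1][j] - psi[i][j]) / h for j in range(n + 1)] for i in range(n)]
    return u, v


def max_cell_divergence(u, v, n, h):
    """Largest absolute value of the discrete divergence over all cells."""
    return max(
        abs((u[i + 1][j] - u[i][j]) / h + (v[i][j + 1] - v[i][j]) / h)
        for i in range(n)
        for j in range(n)
    )


n = 16
h = 1.0 / n
u, v = staggered_velocities(n, h)
print("max |div u| over cells:", max_cell_divergence(u, v, n, h))
```

Because each corner value of the stream function enters the cell divergence twice with opposite signs, the result vanishes up to rounding error. The difficulty described in the project arises near free interfaces, where such regular stencils are no longer available.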
Solving graph problems with a photonic quantum processor
Project Title | Solving graph problems with a photonic quantum processor |
Keywords | quantum computing, graph problems |
Project Listed | 5 January 2024 |
Project Status | Filled |
Contact Name | William Clements |
Contact Email | wclements@orcacomputing.com |
Company/Lab/Department | ORCA Computing |
Address | 30 Eastbourne Terrace, London W2 6LA |
Project Duration | 8 weeks |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information | ORCA Computing is a startup that builds quantum processors using photons. In these processors, several identical single photons are sent into a complex random circuit where they interfere with each other, and a measurement is performed to determine where they exit the circuit. Since photons are quantum particles, the output of the circuit is described by a probability distribution over all possible outcomes, and each measurement yields a sample from this distribution. Sampling from this distribution using classical (i.e. non-quantum) computational resources is hard, and with only a few tens of photons a photonic quantum processor can already outperform the world's most powerful supercomputer on this specific task. ORCA's research and development team works to develop novel applications of these processors. |
Project Description | In this project, you will implement algorithms for solving graph problems using photonic quantum processors. The evolution of single photons in photonic quantum processors is governed by a unitary transfer matrix, and the measurement probabilities at the output of this processor are determined by the properties of this matrix. When the processor is programmed such that the transfer matrix is the adjacency matrix of a graph, the measurement outputs yield information about this graph. Algorithms have thus been developed to solve several types of graph problems using photonic quantum processors, such as finding the number of perfect matchings in a graph, finding dense sub-graphs, and determining whether two graphs are isomorphic. You will study these algorithms in detail, implement them within ORCA’s python software environment, and investigate how they can be run on an ORCA quantum processor. This internship is a unique opportunity to explore the interface between mathematics, quantum physics, programming, and graph problems within one of the UK's most exciting quantum computing startups. |
Work Environment | You will be working as part of the research and development team at ORCA. You will be expected to be in the office, located near Paddington station in London, a minimum of 3 days per week, though accommodations can be made for people commuting from Cambridge. |
References | https://arxiv.org/abs/2301.09594 https://en.wikipedia.org/wiki/Boson_sampling |
Prerequisite Skills | Mathematical physics, Graph theory, Quantum computing |
Other Skills Used in the Project | |
Acceptable Programming Languages | Python |
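For readers new to boson sampling, the following sketch shows the quantity that links single-photon interference to graph problems: the matrix permanent. It relies on two standard facts: the permanent of a 0/1 biadjacency matrix counts the perfect matchings of a bipartite graph, and for single photons entering distinct input modes the probability of one photon in each chosen output mode is the squared modulus of the permanent of the corresponding submatrix of the unitary. The brute-force `permanent` function and the example matrices are illustrative assumptions, not part of ORCA's software environment.

```python
from itertools import permutations
from math import prod, sqrt


def permanent(matrix):
    """Brute-force permanent: sum over permutations of products of entries.

    Costs O(n! * n), so it is only suitable for very small matrices.
    """
    n = len(matrix)
    return sum(
        prod(matrix[i][sigma[i]] for i in range(n))
        for sigma in permutations(range(n))
    )


# Biadjacency matrix of a 6-cycle viewed as a bipartite graph (3 + 3 vertices).
biadjacency = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]
print("perfect matchings:", permanent(biadjacency))

# A 50:50 beamsplitter: two photons, one per input, never exit in separate outputs.
s = 1 / sqrt(2)
beamsplitter = [[s, s], [s, -s]]
print("coincidence probability:", abs(permanent(beamsplitter)) ** 2)
```

The second example reproduces the Hong-Ou-Mandel effect: the two terms of the permanent cancel, so the coincidence probability is zero.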
Risk management in FX electronic trading
Project Title | Risk management in FX electronic trading |
Keywords | Finance, Algo-trading, Execution, Market Making, Portfolio Management |
Project Listed | 5 January 2024 |
Project Status | Closed. Application deadline: 5 February 2024. Please see application instructions below. |
Contact Name | Rashmi Tank; Tom Dobbyn |
Contact Email | rashmi.tank@barclays.com; thomas.dobbyn@barclays.com |
Company/Lab/Department | Barclays |
Address | 1 Churchill Place, Canary Wharf, London E14 5HP |
Project Duration | 8 weeks from late June |
Project Open to | Master's (Part III) students |
Background Information |
The Barclays FX electronic trading business streams tradable FX prices to institutional and corporate clients for them to transact fully electronically. This activity generates FX risk, which is also managed in an automated algorithmic fashion. There is on average a non-zero cost to clearing that risk. Optimal management of risk reduces cost and allows the streamed prices to be more competitive. From a research perspective this touches on portfolio management and execution strategies for trading in wholesale electronic markets. We propose 3 distinct projects within this area. A potential intern can express interest in one or more of these. |
Project Description |
Project 1: Optimal execution schedule in the presence of exogenous price prediction signals
Algorithms are often used when a participant wishes to buy or sell a given quantity of currency or security on public exchanges. These algorithms split the total amount into a schedule of quantities to be executed over a period of time. The general optimal execution schedule is a well-studied problem. When faced with the problem of optimal execution scheduling, a market participant would typically balance a variety of factors including market impact and risk aversion. The aim of this project is to derive a theoretical extension of this framework in the presence of exogenous price prediction signals, and apply it to automated risk management of FX positions. In particular, the project will consider the case where both the size and the timescale of the predicted price movements, or a curve thereof, are available.
Project 2: Identification of persistence of iceberg liquidity levels in external markets
In lit orderbook markets participants can place iceberg orders at a given price level. The total amount of these orders is not visible in the market data stack. It is possible to accurately estimate whether a particular level in the book has iceberg liquidity following a market trade. The aim of the project is to study the persistence of such iceberg price levels in time. This information can be used both in market making and execution logic.
Project 3: Optimal management of multiple portfolios in the presence of portfolio-specific metadata
In the course of FX market making to clients, an electronic business will have a resultant position to risk manage. The composition of that position is heterogeneous, and as such our risk aversion or directional view may depend upon the metadata of the underlying breakdown of that position. The purpose of this study is to establish optimal strategies for management of such a portfolio of risk, with a view to minimising the cost of hedging.
Common to all projects: The intern will have access to our historical data set of market data and trading activity. They will also have access to our Python dev and compute environment for conducting this research. The internship will be at Barclays’ head office in London, and they will be working closely with the team of quants responsible for the development of trading algorithms. |
Work Environment | The internship will be on site on the trading floor at Barclays’ head office in London. The intern will be working closely with the team of quants responsible for the development of trading algorithms. |
References | |
Prerequisite Skills | Statistics, Probability / Markov Chains; Database Queries |
Other Skills Used in the Project | Simulation, Predictive Modelling; Data Visualization |
Acceptable Programming Languages | Python |
Application Instructions |
We would appreciate if you could submit your formal application via the following link: The application entails submitting your CV and some standard testing required for all Barclays applicants. Please note the following deadlines:
|
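As context for Project 1, one well-known formulation of the optimal execution problem is the Almgren-Chriss model, in which the trade-off between market impact and risk aversion produces a closed-form liquidation schedule. The sketch below evaluates its continuous-time form; choosing this model is an assumption made for illustration and is not necessarily the framework used by the Barclays team. It contains no price prediction signal, and all parameter values are hypothetical.

```python
import math


def urgency(risk_aversion, volatility, temporary_impact):
    """Almgren-Chriss urgency kappa = sqrt(lambda * sigma^2 / eta)."""
    return math.sqrt(risk_aversion * volatility ** 2 / temporary_impact)


def remaining_position(total, horizon, t, kappa):
    """Quantity still to be executed at time t, for 0 <= t <= horizon."""
    if kappa == 0:
        # Risk-neutral limit: execute at a constant rate (a TWAP-style schedule).
        return total * (1 - t / horizon)
    return total * math.sinh(kappa * (horizon - t)) / math.sinh(kappa * horizon)


# Hypothetical parameters, chosen only to produce a visibly front-loaded schedule.
kappa = urgency(risk_aversion=1e-6, volatility=3.0, temporary_impact=1e-6)
total, horizon, steps = 1_000_000.0, 1.0, 10
schedule = [
    remaining_position(total, horizon, k * horizon / steps, kappa)
    for k in range(steps + 1)
]
for k, quantity in enumerate(schedule):
    print(f"t = {k / steps:.1f}  remaining = {quantity:,.0f}")
```

A larger urgency front-loads the schedule, trading higher impact cost for less exposure to price risk, while an urgency of zero gives the linear schedule.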
Developing an explainable segmentation tool for duodenal biopsy diagnosis from whole slide images, to maximise likelihood of clinical adoption.
Project Title | Developing an explainable segmentation tool for duodenal biopsy diagnosis from whole slide images, to maximise likelihood of clinical adoption. |
Keywords | Machine Learning, Image Segmentation, Medical Imaging, Explainable AI, Digital Pathology |
Project Listed | 10 January 2024 |
Project Status | Filled |
Contact Name | Elizabeth Soilleux |
Contact Email | ejs17@cam.ac.uk |
Company/Lab/Department | Lyzeum Ltd. |
Address | Department of Pathology, University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0QQ |
Project Duration | 8 weeks |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information | Coeliac disease is an autoimmune disorder where the immune system attacks the body after the consumption of gluten. Coeliac Disease is commonly diagnosed by pathologists looking at duodenal (small intestine) biopsies. However, studies show that the agreement between pathologists when diagnosing Coeliac Disease is only around 75 – 80%. We thus aim to develop more accurate and reliable methods to improve the diagnosis of Coeliac Disease. This project will focus on developing an accurate and explainable machine learning solution to help improve the diagnosis of biopsies to complement existing “black-box” diagnostic algorithms. |
Project Description | The project will involve using semantic and instance segmentation techniques to detect and classify two different types of cells in Whole Slide Images (scanned images of the biopsies): intra-epithelial lymphocytes (IELs) and enterocytes. The ratio of IELs to enterocytes is a strong diagnostic marker often used by pathologists to diagnose Coeliac Disease. Hence, an accurate and reliable detection algorithm for IELs and enterocytes will form a crucial element of first-in-class, accurate and explainable, fully automated Coeliac Disease diagnostic software. The student will be able to build upon and improve existing image segmentation implementations in Python. The prospective student is expected to have prior experience programming in Python; further knowledge of PyTorch and Computer Vision is beneficial but not required. |
Work Environment | While students can work remotely on this project, we would prefer them to work at the department at least 2 days a week. |
References | https://www.sciencedirect.com/science/article/pii/S2153353922007453 https://www.nature.com/articles/s41587-021-01094-0#author-information |
Prerequisite Skills | |
Other Skills Used in the Project | Image processing |
Acceptable Programming Languages | Python |
Evaluating the impact of alternative freight methods on the transportation of cut roses from the Equator to Europe
Project Title | Evaluating the impact of alternative freight methods on the transportation of cut roses from the Equator to Europe |
Keywords | Horticulture |
Project Listed | 11 January 2024 |
Project Status | Filled |
Contact Name | Richard Boyle |
Contact Email | richard.boyle@apexhorticulture.com |
Company/Lab/Department | APEX Horticulture |
Address | MM Flowers Ltd, Pierson Road, The Enterprise Campus, Alconbury Weald, PE28 4YA |
Project Duration | 8 weeks |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information |
MM Flowers is a cut flower importer and distributor based in the UK and Europe. MM was established in 2007 to provide the leading UK retailers (and latterly European retailers) with a transparent, sustainable and innovative interface between the grower & the retailer. As part of a vertically integrated supply chain, MM is owned by Elite, the largest flower grower in the world, based in Colombia, Ecuador & Kenya; VP, the largest rose grower in East Africa, based in Kenya & Ethiopia; and AM Fresh, a leading supplier of fresh fruit to major retailers in the UK and Europe. Collectively, the group is involved in all aspects of the supply chain from breeding, growing, freight, marketing, and distribution. Also part of the wider group, APEX Horticulture is an independent, professional research and development business, offering bespoke services for cut flowers and plants. APEX is based in a purpose-built testing centre adjacent to MM Flowers, an optimal position in the chain ensuring APEX can deliver high quality research on the true performance of flowers and plants subjected to actual supply chain conditions. Whilst cut flowers are grown worldwide, many of those imported and sold in the UK / Europe are from Equatorial regions. Typically these flowers are packed dry in boxes and shipped to the UK / Europe by plane, although due to various factors, including the environmental impact and rising freight costs, alternatives to air freight are being explored. One option is to move the transportation of cut flowers from air to sea freight, which has many benefits but substantially increases the freight time (a 4-5 fold increase from South America to the UK, and potentially a 10+ fold increase from Kenya to the UK). Whilst some flower types have been successfully transitioned to sea freight, historically attempts with roses, the most important cut flower, have been unsuccessful. 
However, given the recent collective global focus on reducing harmful environmental practices, for the last 18 months, all of the stakeholders described above have been carrying out a series of projects to understand the impact of transporting roses (amongst other flower types) from both Kenya & Colombia to the UK & Europe by sea freight. |
Project Description |
The project work described previously has largely been conducted by APEX to understand firstly the potential of different cultivars to be transported by sea, and, where there were visible impacts on either the performance or quality of the flowers, what factors / processes may be influencing this. However, commercial trials have also been undertaken on select cultivars to determine what impact the transition to sea freight may have on both the practicalities of importing and distributing flowers, and most importantly, the final consumer experience. A large amount of data is therefore available from both the project work carried out and the initial commercial trials, supported by an even more extensive historical dataset on roses transported to the UK / Europe by traditional means (air freight). Whilst a huge amount of insight has already been gained from this information, due to the sheer number of factors potentially involved (e.g. cultivar, grower, farm location, growing conditions, freight time, etc.), it is believed that the data can be exploited further to inform strategic decisions going forward. As such, there are a number of areas that MM Flowers, APEX and the wider group would like to explore further:
• Based on the information available, are there key factors that determine the success / failure of transporting roses by sea freight? |
Work Environment | Hybrid - on site / remote |
References | |
Prerequisite Skills | |
Other Skills Used in the Project | |
Acceptable Programming Languages |
Study of ion collision models during implantation.
Project Title | Study of ion collision models during implantation. |
Keywords | Mathematical modelling, monte carlo simulation, mathematical physics, semiconductors |
Project Listed | 18 January 2024 |
Project Status | Filled |
Contact Name | Artem Babayan |
Contact Email | Artem.Babayan@silvaco.com |
Company/Lab/Department | Silvaco - TCAD process simulation |
Address | Silvaco Europe, Compass Point, St Ives, PE27 5JL |
Project Duration | 8 weeks |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information | Mathematical modelling of real-life physical problems |
Project Description |
Silvaco is a software engineering company that develops tools to assist in the manufacturing of semiconductor devices. In the UK office we mostly work on the 'process simulation' side: mathematical modelling of the processes used in manufacturing. One such process is implantation: the bombardment of a piece of (typically) Si with ions (dopants) to change the electrical properties of the target in specific areas. To predict the final ion distribution we use Monte Carlo simulation, following the paths of a large number of ions as they fly through the structure. Several effects are involved, one of which is ions 'bouncing' off the atoms of the crystal lattice. The collision and its effects are described by mathematical models, which are currently approximated by explicit empirical expressions. We are interested in testing how accurate these explicit formulas actually are and how replacing them with more sophisticated approaches may affect the simulation results. Your task would be to review the literature and to suggest and implement the required algorithms. |
Work Environment | The project assumes a high degree of independence. The development part is expected to be done in the office (in St Ives, near Cambridge). |
References | |
Prerequisite Skills | |
Other Skills Used in the Project | Statistics, Mathematical physics, Numerical Analysis, PDEs, Simulation |
Acceptable Programming Languages | Python, MATLAB, C++ |
Diffusion Models for Context-Aware Splice Site Prediction
Project Title | Diffusion Models for Context-Aware Splice Site Prediction |
Keywords | Diffusion Modelling, Splice-Site Prediction, Genetics, Drug Target Discovery, Machine Learning |
Project Listed | 19 January 2024 |
Project Status | Filled |
Contact Name | Lewis Marsh |
Contact Email | lwmr@novonordisk.com |
Company/Lab/Department | Novo Nordisk, Digital Science and Innovation, Human Genetics CoE |
Address | The Innovation Building, Roosevelt Dr, Headington, Oxford OX3 7FZ |
Project Duration | 8-10 weeks, full-time |
Project Open to | Master's (Part III) students |
Background Information | Novo Nordisk is a large pharmaceutical company with a focus on identifying novel therapies for patients with cardiometabolic diseases (e.g. diabetes mellitus, obesity). Within the Human Genetics department of Novo Nordisk, the Computational and Statistical Genomics team is tasked with developing and improving statistics and ML methods that aid the discovery of new drug targets and disease mechanisms based on genomic data. Methods that can yield such insights are ML models for splice-site prediction. Our DNA serves as a blueprint for proteins and thereby for all molecular processes in the human body. When a gene, a subsequence of DNA, is transcribed into RNA and subsequently translated into a protein, certain parts of the sequence are edited out by a process called splicing. How RNA is spliced depends on a number of factors, including the DNA sequence of the gene as well as the part of the body in which the gene is being transcribed. Differences in DNA that affect splicing can be associated with complex diseases. Understanding how genetic variation alters splicing can thus give important insights into disease management and treatment. |
Project Description |
In this project, the student will explore how we can use diffusion models (DMs) to predict splicing and splice-sites. DMs are a generative AI method that uses insights from stochastic processes, such as Markov chains, to diffuse data (i.e. add noise to data until it follows a vanilla distribution, such as a Gaussian or uniform distribution). The models then learn how to "un-diffuse" the vanilla distribution to generate an unseen data point that follows the data distribution closely. DMs have been successfully used to make predictions about proteins [1] (such as their structure) from sequence information only. The student will build a DM, similar to the ones applied in the protein space [2], and train it on splicing data, which can be modelled in a similar way. They will then benchmark their method against state-of-the-art splice site predictors (such as [3], which is not a DM) at making tissue-specific predictions of splice-sites. If time permits, we can also evaluate zero-shot abilities of DMs for predicting splicing by withholding certain tissue contexts during training and then evaluating predictions in unobserved tissues. Applicants should have experience with coding in Python. Experience with the PyTorch library and with version control (e.g. through git) is a plus. Prior knowledge in biology or genetics is not a requirement. |
Work Environment |
The intern will have the possibility to come to the Novo Nordisk offices in London (near King's Cross) or Oxford. The intern can work remotely or in a hybrid setting (from the UK) if they wish, although we expect them to be present in person for on-boarding (in the first week) and for the hand-over of the project (in the last week). Throughout the internship the intern will attend team meetings of the Computational and Statistical Genomics team to get exposure to a broader range of research and the opportunity to interact with other scientists working on related projects. |
References | [1] Alamdari, Sarah, et al. "Protein generation with evolutionary diffusion: sequence is all you need." bioRxiv (2023): 2023-09. [2] Austin, Jacob, et al. "Structured denoising diffusion models in discrete state-spaces." Advances in Neural Information Processing Systems 34 (2021): 17981-17993. [3] Jaganathan, Kishore, et al. "Predicting splicing from primary sequence with deep learning." Cell 176.3 (2019): 535-548. |
Prerequisite Skills | Statistics, Probability/Markov Chains |
Other Skills Used in the Project | |
Acceptable Programming Languages | Python |
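The "diffuse" step mentioned in the project description can be illustrated with a small sketch. It assumes a uniform-transition Markov chain over the four nucleotides, loosely in the spirit of the discrete-state diffusion of [2]. The function names, the noise rate `beta` and the toy sequence are illustrative assumptions, not the project's actual method.

```python
import numpy as np

ALPHABET = "ACGT"  # toy nucleotide alphabet (assumption for illustration)


def uniform_transition_matrix(beta: float, k: int = 4) -> np.ndarray:
    """One noising step: keep the state with prob. 1 - beta, else resample uniformly."""
    return (1.0 - beta) * np.eye(k) + beta * np.full((k, k), 1.0 / k)


def diffuse(seq: str, beta: float, steps: int, rng: np.random.Generator) -> str:
    """Apply `steps` noising steps independently to every position of `seq`."""
    q = uniform_transition_matrix(beta)
    states = np.array([ALPHABET.index(c) for c in seq])
    for _ in range(steps):
        # Each position jumps according to the row of q for its current state.
        states = np.array([rng.choice(len(ALPHABET), p=q[s]) for s in states])
    return "".join(ALPHABET[s] for s in states)


def marginal_after(steps: int, beta: float) -> np.ndarray:
    """Transition probabilities after `steps` steps (matrix power of the one-step matrix)."""
    return np.linalg.matrix_power(uniform_transition_matrix(beta), steps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(diffuse("ACGTACGTACGT", beta=0.1, steps=5, rng=rng))
    # After many steps every row approaches the uniform ("vanilla") distribution.
    print(marginal_after(200, 0.1).round(3))
```

A trained model would learn to reverse these steps; that reverse model is not sketched here.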
Mathematical models of thermal oxidation of silicon carbide: verification and calibration
Project Title | Mathematical models of thermal oxidation of silicon carbide: verification and calibration |
Keywords | Mathematical modelling, simulation, verification, calibration, oxidation |
Project Listed | 19 January 2024 |
Project Status | Filled |
Contact Name | Alexandros Kyrtsos |
Contact Email | alexandros.kyrtsos@silvaco.com |
Company/Lab/Department | Silvaco Europe, TCAD |
Address | Silvaco Technology Centre, Compass Point St Ives, Cambridgeshire, PE27 5JL |
Project Duration | 8 weeks |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information | Process simulation in the semiconductor industry is a crucial tool for developing new technologies, as well as for maintaining existing processes. Thermal oxidation of silicon carbide is a way to produce a thin layer of oxide on the surface of a wafer in the fabrication of microelectronic structures and devices. The project aims to analyse, verify and calibrate the mathematical models of this process and to explore the effects of various modelling assumptions. The successful outcome of the project will become a part of the company's commercial software. |
Project Description | In 1965 Bruce Deal and Andrew Grove proposed an analytical model that satisfactorily describes the growth of an oxide layer on the plane surface of a silicon wafer [1]. This model is implemented in Silvaco's TCAD software to simulate the oxidation process in silicon carbide. The project aims to validate and calibrate the model using experimental data found in the literature. In particular, the effect of the presence of dopants in silicon carbide on the oxidation kinetics will be explored. The details of the oxidation of silicon carbide can be found in [2]. |
Work Environment | The student will work independently under the guidance of the project leader. |
References | [1] Deal, B. E.; A. S. Grove (December 1965). "General Relationship for the Thermal Oxidation of Silicon". Journal of Applied Physics. 36 (12): 3770-3778. [2] https://www.iue.tuwien.ac.at/phd/simonka/index.html Thermal Oxidation and Dopant Activation of Silicon Carbide (Dissertation). |
Prerequisite Skills | Mathematical physics, Simulation, Predictive Modelling |
Other Skills Used in the Project | Simulation |
Acceptable Programming Languages | Python |
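For orientation, the Deal-Grove relation from [1] can be written as x^2 + A x = B (t + tau), where x is the oxide thickness, B is the parabolic rate constant, B/A is the linear rate constant, and tau accounts for any initial oxide. The sketch below evaluates its closed-form positive root. The parameter values are arbitrary placeholders chosen for illustration, not calibrated constants for silicon or silicon carbide, and the function name is hypothetical.

```python
import math


def deal_grove_thickness(t: float, a: float, b: float, tau: float = 0.0) -> float:
    """Positive root x of x**2 + a*x = b*(t + tau)."""
    return 0.5 * a * (math.sqrt(1.0 + 4.0 * b * (t + tau) / (a * a)) - 1.0)


if __name__ == "__main__":
    A, B, TAU = 0.2, 0.01, 0.1  # placeholder values (e.g. um, um^2/h, h); not calibrated
    for hours in (0.5, 1.0, 2.0, 8.0, 32.0):
        x = deal_grove_thickness(hours, A, B, TAU)
        # Short times follow the linear law (B/A)*(t + tau); long times approach sqrt(B*t).
        print(f"t = {hours:5.1f} h  thickness = {x:.4f}")
```

Calibration against experimental data would then amount to fitting A, B and tau, for example by least squares, which is left out of this sketch.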
Grid impedance estimation by Machine learning / AI methods
Project Title | Grid impedance estimation by Machine learning / AI methods |
Keywords | Machine Learning, grid impedance estimation, AI, Power Electronics, Matlab / Simulink |
Project Listed | 24 January 2024 |
Project Status | Closed |
Contact Name | Rob Key |
Contact Email | robert.key@siemens.com |
Company/Lab/Department | Siemens Power Electronics Innovation Hub |
Address | Siemens plc, Varey Road, Congleton, Cheshire, CW12 1PH |
Project Duration | 8 weeks - dates flexible |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information |
In industrial contexts, many power infeed errors arise from challenges associated with grid impedance. Precise estimation of grid inductances or resistances is pivotal in preventing infeed failures, reducing downtime, and optimising control processes. The traditional technique for grid impedance estimation typically entails temporarily halting industrial drives and introducing specific signals during commissioning. However, a pioneering alternative is presented through the application of machine learning/artificial intelligence (ML/AI) techniques. This approach enables continuous real-time estimation of grid impedance with a precision of within +/- 10%. The implementation of this ML/AI-based methodology not only eliminates the need for downtime during the estimation process but also empowers industrial systems to dynamically adapt to changing grid conditions. This capability contributes to the seamless optimisation of the entire drive system and also acts as a pre-emptive measure against potential infeed failures. For a discerning mathematics student at Cambridge University, this cutting-edge approach offers a fascinating intersection of mathematical modelling, data analytics, and practical industrial applications, showcasing the transformative potential of ML/AI in mitigating complex challenges within the realm of power systems. |
Project Description |
Accurate estimation of grid impedance by reduced features Key Learning Outcomes: Project Outcomes: |
Work Environment | Part of the PEL Innovation Team. Lab / office based team with hybrid remote and onsite working arrangements. |
References | K. Givaki, S. Seyedzadeh and K. Givaki, "Machine learning based impedance estimation in power system," 8th Renewable Power Generation Conference (RPG 2019), Shanghai, China, 2019, pp. 1-6, doi: 10.1049/cp.2019.0683. |
Prerequisite Skills | Statistics, Simulation, Data Visualization |
Other Skills Used in the Project | Predictive Modelling, Power Electronics |
Acceptable Programming Languages | No Preference |
Reduction and efficient computation of Neural Networks
Project Title | Reduction and efficient computation of Neural Networks |
Keywords | Neural Network, Algorithms, AI, Optimisation |
Project Listed | 26 January 2024 |
Project Status | Closed |
Contact Name | Rob Key |
Contact Email | Robert.key@siemens.com |
Company/Lab/Department | Siemens Power Electronics Innovation Hub |
Address | Siemens plc, Varey Road, Congleton, Cheshire, CW12 1PH |
Project Duration | 8 weeks, full time |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information | With the ever-expanding uptake of AI, Siemens is interested in conducting a preliminary study into the background and techniques for optimising the implementation of Neural Networks and making their application more efficient. |
Project Description |
As a value-driven company, Siemens is constantly trying to offer the same functionality using the smallest amount of resource. Whilst trained Deep Neural Networks (DNNs) can be incredibly effective in solving many of the industrial automation / control tasks where our products are often deployed, the size and computational overhead required can become prohibitive. We would like a student to explore DNN compression techniques which greatly decrease the number of neurons required to implement a given solution, with the aim of realising a solution on a tightly restricted processor by using the packages provided by Fraunhofer in the given link. In addition to the compression, in the same way that ‘Divide and Conquer’ algorithms can drastically reduce the computational cost of the discrete Fourier transform, we would like the student to try to generalise the problem of Neural Network computation in such a way that the number of operations required to implement the inference part of the Neural Network can be reduced. |
Work Environment | Lab / office based - flexible hybrid |
References | https://www.hhi.fraunhofer.de/en/departments/ai/research-groups/efficient-deep-learning/research-topics/neural-network-reduction-and-optimization.html https://en.wikipedia.org/wiki/Divide-and-conquer_algorithm https://lunalux.io/computational-complexity-of-neural-networks/ |
Prerequisite Skills | Numerical Analysis, Mathematical Analysis, Algebra/Number Theory |
Other Skills Used in the Project | Simulation, Predictive Modelling |
Acceptable Programming Languages | No Preference |
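The divide-and-conquer saving for the discrete Fourier transform that the project description alludes to can be seen in a short sketch. It compares a direct O(n^2) matrix-vector DFT with a recursive radix-2 (Cooley-Tukey style) transform that needs O(n log n) operations. The function names are illustrative, and the recursive version assumes the input length is a power of two.

```python
import numpy as np


def naive_dft(x) -> np.ndarray:
    """Direct DFT: build the full n-by-n matrix and multiply (O(n^2) operations)."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    idx = np.arange(n)
    w = np.exp(-2j * np.pi * np.outer(idx, idx) / n)
    return w @ x


def fft_radix2(x) -> np.ndarray:
    """Recursive radix-2 FFT (O(n log n)); length must be a power of two."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    if n == 0 or (n & (n - 1)) != 0:
        raise ValueError("length must be a positive power of two")
    if n == 1:
        return x.copy()
    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    # Combine the two half-size transforms into the full-size one.
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])


if __name__ == "__main__":
    signal = np.arange(8.0)
    print(np.allclose(fft_radix2(signal), naive_dft(signal)))
```

The saving comes from reusing the two half-size transforms; whether an analogous factorisation exists for a given neural network layer is the open question the project poses.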
Pricing and hedging short term interest rate options
Project Title | Pricing and hedging short term interest rate options |
Keywords | Interest Rates, Options, Pricing, Implied Volatility Fitting, Fixed Income, Black Scholes, Normal vs LogNormal distributions, Exchange Trading, Over-The-Counter Trading, Relative Value |
Project Listed | 1 February 2024; Please contact us by mid-February if you are interested |
Project Status | Closed |
Contact Name | Silvia Stanescu / Yesmine Bahloul |
Contact Email | Silvia.stanescu@aspectcapital.com; Yesmine.bahloul@aspectcapital.com |
Company/Lab/Department | Aspect Capital |
Address | 10 Portman Square, London W1H 6AZ |
Project Duration | July-September full-time, we are happy to discuss any special requirements |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information |
Short-term interest rate (STIR) options are derivative contracts which give the option holder (long party) the right to a payout based on the difference between the underlying rate on expiry of the option and a pre-set level for the rate (i.e. the strike price of the option). The right comes at a price, the premium for the option, which depends on the strike and the time to maturity of the option, inter alia. Equivalently, for a given strike and time to maturity, the option premium can be mapped to an (implied) volatility level, and consequently to a volatility surface over the strike and time to maturity domains. Opportunities in short-term interest rate volatility have increased since the volatility surface shifted regime in late 2021/early 2022. For example, for a benchmark European short-term interest rate (Euribor or, more recently, €STR), option markets were typically pricing a move of 1 to 2 basis points per day before the hiking cycle, versus 7+ bps/day more recently. In a low-carry environment, fixed income investors can absorb only a relatively small rise in realised volatility before hedging/reducing exposure (which, once triggered, can contribute to further hedging). The new regime makes it particularly interesting to explore the valuation of these products versus more liquid instruments. |
Project Description |
The expectations of the project are: No prior knowledge of the instruments is required, but a basic understanding of (or interest in) the Fixed Income market and products (futures, OTC swaps) will definitely help. We will provide sample data: premiums and implied volatilities of caps/floors and futures options, plus the required interest rate curves. These are the different tasks for the project that we can adapt to the student/context: |
Work Environment | The student will be mentored by at least one person from the volatility team. The team includes 4 researchers and the student will be encouraged to interact with all of the team as well as other research teams if/when needed. |
References | Some resources to get familiar with the topic: - Handbook of volatility models and their applications, Bauwens, Hafner, Laurent - Pricing and Hedging Interest Rate Options: Evidence from Cap-Floor Markets – Gupta2002 - Valuing Interest Rate Caps and Floors Using QuantLib Python: by Goutham Balaraman - Interest Rate Options Conventions from the AFMA association - Interest Rate Caps and Floors Primer - FinPricing |
Prerequisite Skills | Numerical Analysis, PDEs, Mathematical Analysis, Simulation, Database Queries, Data Visualisation |
Other Skills Used in the Project | Probability / Markov Chains, Algebra / Number Theory, Predictive Modelling |
Acceptable Programming Languages | Python, MATLAB |
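The mapping between an option premium and an implied volatility, and the "Normal vs LogNormal" distinction in the keywords, can be sketched as follows. This is a minimal illustration assuming undiscounted call prices on a forward rate under the Bachelier (normal) and Black-76 (lognormal) models, with the implied volatility found by bisection. The function names and numerical inputs are hypothetical and are not the host's methodology.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def bachelier_call(forward: float, strike: float, sigma_n: float, t: float) -> float:
    """Undiscounted call under the normal model (also works for negative rates)."""
    std = sigma_n * math.sqrt(t)
    d = (forward - strike) / std
    return (forward - strike) * norm_cdf(d) + std * norm_pdf(d)


def black76_call(forward: float, strike: float, sigma_ln: float, t: float) -> float:
    """Undiscounted call under the lognormal model (needs forward > 0 and strike > 0)."""
    std = sigma_ln * math.sqrt(t)
    d1 = (math.log(forward / strike) + 0.5 * std * std) / std
    return forward * norm_cdf(d1) - strike * norm_cdf(d1 - std)


def implied_vol(pricer, premium, forward, strike, t, lo=1e-8, hi=5.0, iters=200):
    """Bisection on volatility; relies on the price increasing with volatility."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pricer(forward, strike, mid, t) < premium:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    f, k, t = 0.03, 0.035, 0.5  # illustrative forward rate, strike and expiry in years
    premium = bachelier_call(f, k, 0.01, t)
    print("normal implied vol:   ", implied_vol(bachelier_call, premium, f, k, t))
    print("lognormal implied vol:", implied_vol(black76_call, premium, f, k, t))
```

The same premium therefore corresponds to two different volatility numbers depending on the model convention, which is one reason quotes need to state whether they are normal or lognormal.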
Changepoint Detection in Financial Time Series
Project Title | Changepoint Detection in Financial Time Series |
Keywords | Changepoint detection, financial mathematics, time series analysis |
Project Listed | 10 April 2024 |
Project Status | Filled. Please apply by 1 May 2024 |
Contact Name | Chris Hunter and Que Vuong-Crouzier |
Contact Email | qvuong@pharo.com |
Company/Lab/Department | Pharo Management |
Address | 8 Lancelot Place, London, SW7 1DR |
Project Duration | 8 weeks, full time |
Project Open to | Undergraduates, Master's (Part III) students |
Background Information | Change point detection (CPD) aims to detect sudden changes in the properties of time series data, for example the level of volatility or the correlation between two time series. This is important for financial applications, as markets often switch between different regimes, resulting in different behaviour across the traded assets. Many existing tools are built on assumptions of stationarity and are slow to adapt, whereas CPD methods can adapt quickly when the data exhibit change. |
Project Description |
There are several approaches to CPD, including online vs offline and Bayesian vs Frequentist vs non-parametric. The student should first conduct a literature review of the various methodologies and then investigate one or more specific methods in depth using financial data. Some examples are: - Online CPD methods for measuring asset volatility and correlation |
Work Environment | The student will ideally work from Pharo's office on a full-time basis. The project will be done independently with guidance from the team |
References | https://techrando.com/2019/08/14/a-brief-introduction-to-change-point-detection-using-python/ https://arxiv.org/abs/2003.06222 https://arxiv.org/abs/1906.10372 |
Prerequisite Skills | Statistics |
Other Skills Used in the Project | Statistics |
Acceptable Programming Languages | Python |
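As a concrete starting point for the offline, frequentist end of the spectrum mentioned in the project description, the sketch below locates a single change in the mean of a series by minimising the total within-segment sum of squared errors over all split points. It is one simple textbook approach and not necessarily the method the host has in mind. The function name is illustrative, and applying it to squared returns as a proxy for a volatility change is an assumption.

```python
import numpy as np


def single_changepoint_in_mean(x) -> int:
    """Return k such that x[:k] and x[k:] are the best two constant-mean segments."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two observations")
    s1 = np.cumsum(x)       # running sum
    s2 = np.cumsum(x ** 2)  # running sum of squares
    k = np.arange(1, n)     # candidate sizes of the left segment
    # Sum of squared errors around the segment mean: sum(x^2) - (sum x)^2 / length.
    left_sse = s2[k - 1] - s1[k - 1] ** 2 / k
    right_sum = s1[-1] - s1[k - 1]
    right_sse = (s2[-1] - s2[k - 1]) - right_sum ** 2 / (n - k)
    return int(k[np.argmin(left_sse + right_sse)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy returns whose standard deviation jumps from 1% to 3% halfway through.
    returns = np.concatenate([rng.normal(0, 0.01, 250), rng.normal(0, 0.03, 250)])
    print(single_changepoint_in_mean(returns ** 2))
```

An online method would instead update its estimate as each observation arrives, which this batch computation does not attempt.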