Summer Research Programmes

 

Below you will find the list of CMP industrial projects hosted by external companies.  Click here to see the list of academic projects hosted by other departments and labs within the university.  

New projects may be added, so check back regularly!

You can enquire about the projects you are interested in by writing to the contact given in the project listing.  Unless alternative instructions are given in the project listing, to apply for a project you should send your CV to the contact along with a covering email which says why you are interested in the project and why you think you would be a good fit.  Please note, some projects have an earlier application deadline than the general CMP deadline of Friday 23 February 2024. 

For tips on making strong placement applications see the advice here from CMP Co-Founder James Bridgwater. It’s better to put the work into making fewer but stronger applications than firing off a very generic application to all projects – you won’t stand out with the latter approach!  

At the CMP Lunchtime Seminar Series in February 2024 many hosts gave short presentations about their projects.  Where possible these were recorded and links to the videos posted on the seminar webpage. 

Please note that to participate in the CMP programme you must be a student in Part IB, Part II, or Part III of the Mathematical Tripos at Cambridge.

 

Industrial CMP Project Proposals for Summer 2024

 

Providing genomic and molecular network context for in-silico predicted drug-target candidates to aid hypothesizing their mode of action

Project Title Providing genomic and molecular network context for in-silico predicted drug-target candidates to aid hypothesizing their mode of action
Keywords Genome-wide association study, drug discovery, mode of action, biomedical knowledge graph, graph link prediction
Project Listed 5 January 2024
Project Status Filled
Contact Name Marie Lisandra Zepeda Mendoza
Contact Email vmnz@novonordisk.com
Company/Lab/Department Novo Nordisk, Digital Science and Innovation, Machine Intelligence
Address 1 Pancras Square, London, N1C 4AG
Project Duration 8 weeks
Project Open to Master's (Part III) students
Background Information

Novo Nordisk Research Centre Oxford (NNRCO) is an innovative target discovery and translational research unit with a focus on identifying novel therapies for patients with cardiometabolic diseases (e.g. diabetes mellitus, obesity). To identify novel drug targets, we employ a variety of advanced computational biology techniques (graph data science and machine learning) on a myriad of data sources (genomics, transcriptomics, etc).

In particular, the Machine Intelligence group has developed graph-based machine learning models to perform link prediction on knowledge graphs (KGs) and has explored the positive impact of enriching a KG with task-specific information when exploring the molecular landscape of complex pathologies. Genetic information provides direct evidence of the relevance of a drug-target candidate to a disease; however, the identifications come without a clear molecular context.

One of the main aspects to consider for the hypothesizing of the mode of action (MoA) of a drug-target candidate is its tissue of action (ToA). Gene candidates identified through genome wide association studies (GWAS) can be contextualized into the biology of a tissue by using various bioinformatic functional annotations, gene expression and epigenomic data.

Project Description

There is an analytical framework to predict scores for the ToA from a GWAS meta-analysis of type 2 diabetes (T2D) [1] that the student will implement in-house. There is also a published knowledge graph called GenomicKB [2], with all publicly available genomic information, including GWAS associations, epigenome, transcriptome and other bioinformatic annotations. The student will enrich the in-house version of GenomicKB with the findings of the ToA scoring software and compare the KG link prediction results of the original versus the enriched KG to identify genes associated with T2D.

The student will spend 8 weeks exploring: 1) the value of predicting ToA for genetic findings in helping hypothesize the MoA of potentially relevant in-house candidates, and 2) the impact that enriching a genomic KG with task-specific information has on link prediction algorithms to identify genes related to T2D. To this end, the student will first implement in-house the software that scored the ToA of genetic hits for T2D, and will then work with KG modelling approaches to evaluate link prediction algorithms on the original and the enriched KG.
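As a rough illustration of comparing link prediction on an original versus an enriched graph, the sketch below scores candidate gene-disease links by counting common neighbours. This is a deliberately simple stand-in for the graph-based machine learning models mentioned above, and every node name, edge and function here is hypothetical rather than taken from GenomicKB or the ToA framework.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def common_neighbour_score(adj, u, v):
    """Number of nodes linked to both u and v (a simple link-prediction score)."""
    return len(adj[u] & adj[v])


# Toy graph; every node name here is hypothetical.
original_edges = [
    ("GENE_A", "pathway_1"),
    ("GENE_B", "pathway_1"),
    ("T2D", "pathway_1"),
    ("GENE_C", "pathway_2"),
]
# Hypothetical ToA-derived edges: genes and the disease linked to a tissue node.
toa_edges = [
    ("GENE_A", "tissue_islet"),
    ("GENE_C", "tissue_islet"),
    ("T2D", "tissue_islet"),
]

original = build_adjacency(original_edges)
enriched = build_adjacency(original_edges + toa_edges)

genes = ["GENE_A", "GENE_B", "GENE_C"]
scores_original = {g: common_neighbour_score(original, g, "T2D") for g in genes}
scores_enriched = {g: common_neighbour_score(enriched, g, "T2D") for g in genes}
```

In this toy case the enrichment raises the score of the genes that share the tissue node with the disease, which is the kind of before/after difference the comparison is meant to surface.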

Work Environment

The student will have the possibility to come to the Novo Nordisk office in London, where they can interact with the other three members of the Machine Intelligence Dept. who are based there, in particular Marc Boubnouvski, who will be the main London contact. M. Lisandra Zepeda Mendoza is based in Oxford, where the student can also visit if they wish, and Lisandra will also visit the London office on an ad hoc basis for more in-depth discussions with the student.

For the duration of the internship, the student will also attend the Machine Intelligence Department meetings, which are held online, as the other team members are based in Seattle and Copenhagen.

The student can also choose to work from home if they wish. However, personal communication at the London office would be preferred.

References [1] Torres, Jason M., et al. "A multi-omic integrative scheme characterizes tissues of action at loci associated with type 2 diabetes." The American Journal of Human Genetics 107.6 (2020): 1011-1028.
[2] Fan Feng, Feitong Tang, Yijia Gao, Dongyu Zhu, Tianjun Li, Shuyuan Yang, Yuan Yao, Yuanhao Huang, Jie Liu, GenomicKB: a knowledge graph for the human genome, Nucleic Acids Research, Volume 51, Issue D1, 6 January 2023, Pages D950-D956
Prerequisite Skills Statistics, Mathematical Analysis, Geometry/Topology, coding skills (python)
Other Skills Used in the Project Predictive Modelling, Database Queries
Acceptable Programming Languages Python, R

 

Internship in Quantum Computing

Project Title Internship in Quantum Computing
Keywords Quantum, Computing, Algorithms, Software, Programming
Project Listed 5 January 2024
Project Status Filled
Application deadline: 21 January 2024
Contact Name Emily Wild
Contact Email emily.wild@riverlane.com
Company/Lab/Department Riverlane
Address St Andrew's House, 59 St Andrew's Street, Cambridge, CB2 3BZ
Project Duration 10 - 12 weeks, full-time
Project Open to Master's (Part III) students
Background Information Riverlane’s mission is to make quantum computing useful, sooner. From climate change to healthcare, large and reliable quantum computers will help solve some of the world’s biggest challenges. Riverlane is building the quantum error correction layer to make this happen sooner. It’s a complex problem that requires a range of skills, talent and passion. We’re making remarkable progress and growing fast.
Project Description

Our full-time summer internships are designed to enable current students in a technical field to translate their skills and expertise into an industrial setting. You will join us at our Head Office in Cambridge, UK, for 10 to 12 weeks, where you will have the opportunity to work alongside our team of talented software and hardware engineers, mathematicians, quantum information theorists, computational chemists and physicists – all experts in their fields.

Every intern will have a dedicated supervisor and will work on a project designed to make the best use of their background and skills whilst developing their knowledge of quantum computing. We will support all interns to try and produce a concrete output by the end of the internship such as a paper, product, or software tool.

What you will do:
- Develop, devise and research algorithms and software to enhance Riverlane’s capabilities, contributing to one or more projects that are core to Riverlane’s goals
- Discuss ideas with colleagues and communicate work in the form of presentations and reports
- Develop an understanding of quantum computers and their industrial applications

Requirements
- A current student studying for a Master’s degree (including the final year of an integrated undergraduate and Master’s degree)
- Proven ability in computational and/ or theoretical work
- Experience with at least one programming language
- Excellent critical thinking and problem-solving ability
- Strong communication skills, both written and verbal
- Ability to take initiative and to work well as part of a team
- An interest in quantum computing (extensive knowledge or experience is not required)

For more information, please visit https://www.riverlane.com/jobs/internships To apply please visit https://apply.workable.com/riverlane/j/A4807E07E1/ or email your CV and covering letter to jobs@riverlane.com by 21 January 2024.

Work Environment

Everyone is welcome at Riverlane. We are an equal opportunities employer and encourage applications from eligible and suitably qualified candidates regardless of age, disability, ethnicity, gender, gender reassignment, religion or belief, sexual orientation, marital or civil partnership status, or pregnancy and maternity/paternity.

Studies have shown that women tend to apply to jobs if they meet all or almost all of the requirements whereas men apply even if they meet only some of the requirements. If that sounds like you then please apply – we are happy to review your application and let you know if we think you might be a good fit.

If you need any adjustments made to the application or selection process so you can do your best, please let us know. We will be happy to help.

Important notes:
1. We are only able to consider applications from (i) current university students who are UK/ Irish nationals OR who have UK/ Irish settled/ pre-settled status OR Master's students who are on a Tier 4 visa.
2. Regrettably we are unable to provide letters of sponsorship or letters of support for sponsorship applications for your internship scheme. You must be available full-time for 10 to 12 weeks over the summer, preferably between late June and late September 2024.
3. We require a signed agreement from you assigning the ownership of any IP produced during your internship to Riverlane.
4. Internships are based at our Head Office in Cambridge, UK.

References https://www.riverlane.com/research
Prerequisite Skills Proven ability in computational and/or theoretical work (please see 'Project Description' for full details of requirements)
Other Skills Used in the Project  
Acceptable Programming Languages No Preference

 

Finite difference approximation of the Stokes flow with free interfaces on staggered Cartesian grids.

Project Title Finite difference approximation of the Stokes flow with free interfaces on staggered Cartesian grids.
Keywords Stokes flow, free interface, finite difference methods, numerical linear algebra.
Project Listed 5 January 2024
Project Status Filled
Contact Name Vasily Suvorov
Contact Email vasily.suvorov@silvaco.com
Company/Lab/Department Silvaco Europe, TCAD.
Address Silvaco Technology Centre Compass Point St Ives, Cambridgeshire, United Kingdom PE27 5JL
Project Duration 8 weeks, 40 hours/week
Project Open to Undergraduates, Master's (Part III) students
Background Information Modern semiconductor technology involves processes in which materials with free interfaces undergo large, slow deformations. Such deformations can often be modelled by incompressible Stokes flow. The project aims to analyse the company’s working numerical approach to modelling such flow, with the aim of improving accuracy, stability and convergence.
Project Description Silvaco uses finite difference schemes on structured 2D and 3D Cartesian grids to simulate Stokes flow with free interfaces. A particular difficulty in applying such schemes is the approximation of the boundary conditions at the free interfaces and of the momentum equations near the interfaces. At such irregular points the finite difference stencils do not form regular, orthogonal patterns but have irregular shapes, and special approximations are applied to the corresponding boundary conditions. Although this approach works in practice, it requires further mathematical analysis to improve the accuracy and stability of the numerical schemes. The student will help to better understand the approximation and stability properties of the current numerical schemes and possibly suggest better ones. Special attention will be given to the approximation of the pressure jump conditions across the interface using the approach suggested in [1].
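For readers unfamiliar with staggered grids, the sketch below shows the regular-cell case only: a discrete divergence on a generic marker-and-cell (MAC) arrangement, with velocities stored on cell faces and the result at cell centres. The layout, function name and test field are assumptions made for illustration; this is not Silvaco's scheme, and the irregular stencils near a free interface are not covered.

```python
def divergence_mac(u, v, dx, dy):
    """Discrete divergence at cell centres of a staggered (MAC) grid.

    u[j][i] is the x-velocity on the vertical face at x = i*dx, y = (j+0.5)*dy
    (shape ny x (nx+1)); v[j][i] is the y-velocity on the horizontal face at
    x = (i+0.5)*dx, y = j*dy (shape (ny+1) x nx).
    """
    ny = len(u)
    nx = len(v[0])
    return [
        [
            (u[j][i + 1] - u[j][i]) / dx + (v[j + 1][i] - v[j][i]) / dy
            for i in range(nx)
        ]
        for j in range(ny)
    ]


nx, ny, dx, dy = 4, 3, 0.25, 0.5
# Sample the divergence-free field (u, v) = (x, -y) at the face locations.
u = [[i * dx for i in range(nx + 1)] for j in range(ny)]
v = [[-(j * dy) for i in range(nx)] for j in range(ny + 1)]
div = divergence_mac(u, v, dx, dy)
```

Because the sampled field is divergence-free and linear, the discrete divergence vanishes in every cell, which makes it a convenient sanity check before moving on to cells cut by an interface.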
Work Environment The student will work independently, with support and guidance from the supervisor.
References [1] K. Ito, Zh. Li and X. Wan, “Pressure jump conditions for Stokes equations with discontinuous viscosity in 2D and 3D”, Methods and Applications of Analysis, Vol. 13, No. 2, pp. 199-214, June 2006.
Prerequisite Skills Numerical Analysis, PDEs, Mathematical Analysis, Algebra/Number Theory
Other Skills Used in the Project  
Acceptable Programming Languages Python, MATLAB

 

Solving graph problems with a photonic quantum processor

Project Title Solving graph problems with a photonic quantum processor
Keywords quantum computing, graph problems
Project Listed 5 January 2024
Project Status Filled
Contact Name William Clements
Contact Email wclements@orcacomputing.com
Company/Lab/Department ORCA Computing
Address 30 Eastbourne Terrace, London W2 6LA
Project Duration 8 weeks
Project Open to Undergraduates, Master's (Part III) students
Background Information ORCA Computing is a startup that builds quantum processors using photons. In these processors, several identical single photons are sent into a complex random circuit where they interfere with each other, and a measurement is performed to determine where they exit the circuit. Since photons are quantum particles, the output of the circuit is described by a probability distribution over all possible outcomes, and each measurement yields a sample from this distribution. Sampling from this distribution using classical (i.e. non-quantum) computational resources is hard, and with only a few tens of photons a photonic quantum processor can already outperform the world's most powerful supercomputer on this specific task. ORCA's research and development team works to develop novel applications of these processors.
Project Description In this project, you will implement algorithms for solving graph problems using photonic quantum processors. The evolution of single photons in photonic quantum processors is governed by a unitary transfer matrix, and the measurement probabilities at the output of this processor are determined by the properties of this matrix. When the processor is programmed such that the transfer matrix is the adjacency matrix of a graph, the measurement outputs yield information about this graph. Algorithms have thus been developed to solve several types of graph problems using photonic quantum processors, such as finding the number of perfect matchings in a graph, finding dense sub-graphs, and determining whether two graphs are isomorphic. You will study these algorithms in detail, implement them within ORCA’s python software environment, and investigate how they can be run on an ORCA quantum processor. This internship is a unique opportunity to explore the interface between mathematics, quantum physics, programming, and graph problems within one of the UK's most exciting quantum computing startups.
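One of the graph problems named above, counting perfect matchings, has a simple classical brute-force solution that could serve as a reference for small graphs. The sketch below is only such a baseline under that assumption: it is not the photonic algorithm, and the function name and example graphs are chosen purely for illustration.

```python
def count_perfect_matchings(adj):
    """Count perfect matchings of a simple undirected graph.

    adj is a symmetric 0/1 adjacency matrix (list of lists). Exponential-time
    recursion: match the lowest unmatched vertex with each available neighbour.
    """
    n = len(adj)
    if n % 2:
        return 0

    def recurse(unmatched):
        if not unmatched:
            return 1
        first, rest = unmatched[0], unmatched[1:]
        total = 0
        for k, other in enumerate(rest):
            if adj[first][other]:
                total += recurse(rest[:k] + rest[k + 1:])
        return total

    return recurse(tuple(range(n)))


# 4-cycle 0-1-2-3-0: two perfect matchings, {01, 23} and {12, 30}.
cycle4 = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
# Complete graph K4: three perfect matchings.
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
```

The running time grows very quickly with the number of vertices, so this is useful only for checking results on small instances.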
Work Environment You will be working as part of the research and development team at ORCA. You will be expected to be in the office, located near Paddington station in London, a minimum of 3 days per week, though accommodations can be made for people commuting from Cambridge.
References https://arxiv.org/abs/2301.09594
https://en.wikipedia.org/wiki/Boson_sampling
Prerequisite Skills Mathematical physics, Graph theory, Quantum computing
Other Skills Used in the Project  
Acceptable Programming Languages Python

 

Risk management in FX electronic trading

Project Title Risk management in FX electronic trading
Keywords Finance, Algo-trading, Execution, Market Making, Portfolio Management
Project Listed 5 January 2024
Project Status Closed
Application deadline: 5 February 2024
Please see application instructions below
Contact Name Rashmi Tank; Tom Dobbyn
Contact Email rashmi.tank@barclays.com; thomas.dobbyn@barclays.com
Company/Lab/Department Barclays
Address 1 Churchill Place, Canary Wharf, London E14 5HP
Project Duration 8 weeks from late June
Project Open to Master's (Part III) students
Background Information

The Barclays FX electronic trading business streams tradable FX prices to institutional and corporate clients for them to transact fully electronically. This activity generates FX risk, which is also managed in an automated algorithmic fashion. There is on average a non-zero cost to clearing that risk.

Optimal management of risk reduces cost and allows the streamed prices to be more competitive. From a research perspective this touches on portfolio management and execution strategies for trading in wholesale electronic markets.

We propose 3 distinct projects within this area. A potential intern can express interest in one or more of these.

Project Description

Project 1: Optimal execution schedule in the presence of exogenous price prediction signals

Algorithms are often used when a participant wishes to buy or sell a given quantity of currency or security on public exchanges. These algorithms split the total amount into a schedule of quantities to be executed over a period of time. The general optimal execution schedule is a well-studied problem. When faced with the problem of optimal execution scheduling, a market participant would typically balance a variety of factors including market impact and risk aversion. The aim of this project is to derive a theoretical extension of this framework in the presence of exogenous price prediction signals, and apply it to automated risk management of FX positions. In particular, we will consider the case where the signal gives both a size and a timescale for the price movements, or a curve thereof.
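To make the trade-off between market impact and risk aversion concrete, the sketch below computes a schedule of the Almgren-Chriss type in its continuous-time form. Taking Almgren-Chriss as the baseline framework is an assumption, since the listing does not name one, and the parameter names and values are purely illustrative. No price prediction signal is included; that extension is the subject of the project.

```python
import math


def execution_schedule(total_qty, horizon, n_steps, risk_aversion, volatility, impact):
    """Remaining holdings x(t_k) = X * sinh(kappa*(T - t_k)) / sinh(kappa*T).

    kappa = sqrt(risk_aversion * volatility**2 / impact) is the continuous-time
    urgency parameter; kappa -> 0 recovers a straight-line (TWAP-like) schedule.
    """
    kappa = math.sqrt(risk_aversion * volatility ** 2 / impact)
    times = [horizon * k / n_steps for k in range(n_steps + 1)]
    if kappa * horizon < 1e-8:
        return [total_qty * (1 - t / horizon) for t in times]
    denom = math.sinh(kappa * horizon)
    return [total_qty * math.sinh(kappa * (horizon - t)) / denom for t in times]


# Illustrative (hypothetical) parameter values.
holdings = execution_schedule(
    total_qty=1_000_000, horizon=1.0, n_steps=10,
    risk_aversion=2e-6, volatility=0.95, impact=2.5e-6,
)
# Quantity to trade in each interval is the drop in holdings.
trades = [a - b for a, b in zip(holdings, holdings[1:])]
```

A higher risk aversion increases kappa and front-loads the schedule, while a higher impact cost flattens it towards equal slices.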

Project 2: Identification of persistence of iceberg liquidity levels in external markets

In lit orderbook markets participants can place iceberg orders at a given price level. The total amount of these orders is not visible in the market data stack. It is possible to accurately estimate whether a particular level in the book has iceberg liquidity following a market trade. The aim of the project is to study the persistence of such iceberg price levels in time. This information can be used both in market making and execution logic.

Project 3: Optimal management of multiple portfolios in the presence of portfolio-specific metadata

In the course of FX market making to clients, an electronic business will have a resultant position to risk manage. The composition of that position is heterogeneous, and as such our risk aversion or directional view may depend upon the metadata of the underlying breakdown of that position. The purpose of this study is to establish optimal strategies for management of such a portfolio of risk, with a view to minimising the cost of hedging.

Common to all projects: The intern will have access to our historical data set of market data and trading activity. They will also have access to our Python dev and compute environment for conducting this research. The internship will be at Barclays’ head office in London, and the intern will be working closely with the team of quants responsible for the development of trading algorithms.

Work Environment The internship will be on site on the trading floor at Barclays’ head office in London. The intern will be working closely with the team of quants responsible for the development of trading algorithms.
References  
Prerequisite Skills Statistics, Probability / Markov Chains; Database Queries
Other Skills Used in the Project Simulation, Predictive Modelling; Data Visualization
Acceptable Programming Languages Python
Application Instructions

We would appreciate it if you could submit your formal application via the following link:
https://search.jobs.barclays/job/london/electronic-trading-associate-off-cycle-internship-2024-london/13015/54029345952

The application entails submitting your CV and some standard testing required for all Barclays applicants.

Please note the following deadlines:

  • Applications (CV submission, candidate details, online assessments) must be completed via the above link by the deadline of 5 February 2024.
  • Successful candidates will receive, as part of the recruitment process, an online assessment to showcase their technical skills by solving job-related problems. Deadline to complete the assessment is 13 February 2024.
  • Successful candidates will receive an invitation to take part in person in an Assessment Centre at Barclays, which is planned to be held in late February.

 

Developing an explainable segmentation tool for duodenal biopsy diagnosis from whole slide images, to maximise likelihood of clinical adoption.

Project Title Developing an explainable segmentation tool for duodenal biopsy diagnosis from whole slide images, to maximise likelihood of clinical adoption.
Keywords Machine Learning, Image Segmentation, Medical Imaging, Explainable AI, Digital Pathology
Project Listed 10 January 2024
Project Status Filled
Contact Name Elizabeth Soilleux
Contact Email ejs17@cam.ac.uk
Company/Lab/Department Lyzeum Ltd.
Address Department of Pathology, University of Cambridge, Cambridge Biomedical Campus, Cambridge CB2 0QQ
Project Duration 8 weeks
Project Open to Undergraduates, Master's (Part III) students
Background Information Coeliac disease is an autoimmune disorder where the immune system attacks the body after the consumption of gluten. Coeliac Disease is commonly diagnosed by pathologists looking at duodenal (small intestine) biopsies. However, studies show that the agreement between pathologists when diagnosing Coeliac Disease is only around 75 – 80%. We thus aim to develop more accurate and reliable methods to improve the diagnosis of Coeliac Disease. This project will focus on developing an accurate and explainable machine learning solution to help improve the diagnosis of biopsies to complement existing “black-box” diagnostic algorithms.
Project Description The project will involve using semantic and instance segmentation techniques to detect and classify two different types of cells in Whole Slide Images (scanned images of the biopsies): intra-epithelial lymphocytes (IELs) and enterocytes. The ratio of IELs to enterocytes is a strong diagnostic marker often used by pathologists to diagnose Coeliac Disease. Hence, an accurate and reliable detection algorithm for IELs and enterocytes will form a crucial element of first-in-class, accurate and explainable, fully automated Coeliac Disease diagnostic software. The student will be able to build upon and improve existing image segmentation implementations in Python. The prospective student is expected to have prior experience programming in Python; further knowledge of PyTorch and Computer Vision is beneficial but not required.
Work Environment While students can work remotely on this project, we would prefer them to work at the department at least 2 days a week.
References https://www.sciencedirect.com/science/article/pii/S2153353922007453
https://www.nature.com/articles/s41587-021-01094-0#author-information
Prerequisite Skills  
Other Skills Used in the Project Image processing
Acceptable Programming Languages Python

 

Evaluating the impact of alternative freight methods on the transportation of cut roses from the Equator to Europe

Project Title Evaluating the impact of alternative freight methods on the transportation of cut roses from the Equator to Europe
Keywords Horticulture
Project Listed 11 January 2024
Project Status Filled
Contact Name Richard Boyle
Contact Email richard.boyle@apexhorticulture.com
Company/Lab/Department APEX Horticulture
Address MM Flowers Ltd, Pierson Road, The Enterprise Campus, Alconbury Weald, PE284YA
Project Duration 8 weeks
Project Open to Undergraduates, Master's (Part III) students
Background Information

MM Flowers is a cut flower importer and distributor based in the UK and Europe. MM was established in 2007 to provide the leading UK retailers (and latterly European retailers) with a transparent, sustainable and innovative interface between the grower & the retailer. As part of a vertically integrated supply chain, MM is owned by Elite, the largest flower grower in the world, based in Colombia, Ecuador & Kenya; VP, the largest rose grower in East Africa, based in Kenya & Ethiopia; and AM Fresh, a leading supplier of fresh fruit to major retailers in the UK and Europe. Collectively, the group is involved in all aspects of the supply chain from breeding, growing, freight, marketing, and distribution. Also part of the wider group, APEX Horticulture is an independent, professional research and development business, offering bespoke services for cut flowers and plants. APEX is based in a purpose-built testing centre adjacent to MM Flowers, an optimal position in the chain ensuring APEX can deliver high quality research on the true performance of flowers and plants subjected to actual supply chain conditions.

Whilst cut flowers are grown worldwide, many of those imported and sold in the UK / Europe are from Equatorial regions. Typically these flowers are packed dry in boxes and shipped to the UK / Europe by plane, although due to various factors, including the environmental impact and rising freight costs, alternatives to air freight are being explored. One of those may be transitioning the transportation of cut flowers from air to sea freight, which has many benefits but substantially increases the freight time (a 4-5 fold increase from South America to the UK, and potentially a 10+ fold increase from Kenya to the UK). Whilst there are some flower types that have been successfully transitioned to sea freight, historically, attempts with roses, the most important cut flower, have been unsuccessful. However, given the recent collective global focus on reducing harmful environmental practices, for the last 18 months, all of the stakeholders described above have been carrying out a series of projects to understand the impact of transporting roses (amongst other flower types) from both Kenya & Colombia to the UK & Europe by sea freight.

Project Description

The project work described previously has largely been conducted by APEX to understand firstly the potential of different cultivars to be transported by sea, and, where there were visible impacts on either the performance or quality of the flowers, what factors / processes may be influencing this. However, commercial trials have also been undertaken on select cultivars to determine what impact the transition to sea freight may have on both the practicalities of importing and distributing flowers, and most importantly, the final consumer experience. A large amount of data is therefore available from both the project work carried out and the initial commercial data, supported by even more extensive historical datasets on roses transported to the UK / Europe by traditional means (air freight). Whilst a huge amount of insight has already been gained from this information, due to the sheer number of factors potentially involved (e.g. cultivar, grower, farm location, growing conditions, freight time, etc), it is believed that this can be exploited further to inform strategic decisions going forward. As such, there are a number of areas that MM Flowers, APEX and the wider group would like to explore further:

• Based on the information available, are there key factors that determine the success / failure of transporting roses by sea freight?
• Do specific growers / farms produce roses that are better suited to sea freight transportation?
• Can MM Flowers data (such as quality assurance inspections, and sales, waste & complaints) be used to determine the success of commercially transporting select rose cultivars to the UK?

Work Environment Hybrid - on site / remote
References  
Prerequisite Skills  
Other Skills Used in the Project  
Acceptable Programming Languages  

 

Study of ion collision models during implantation.

Project Title Study of ion collision models during implantation.
Keywords Mathematical modelling, monte carlo simulation, mathematical physics, semiconductors
Project Listed 18 January 2024
Project Status Filled
Contact Name Artem Babayan
Contact Email Artem.Babayan@silvaco.com
Company/Lab/Department Silvaco - TCAD process simulation
Address Silvaco Europe, Compass Point, St Ives, PE27 5JL
Project Duration 8 weeks
Project Open to Undergraduates, Master's (Part III) students
Background Information Mathematical modelling of real-life physical problems
Project Description

Silvaco is a software engineering company developing tools to assist in the manufacturing of semiconductor devices. In the UK office we mostly work on the 'process simulation' side: mathematical modelling of the processes used in manufacturing.

One such process is implantation: the bombardment of a piece of (typically) Si with ions (dopants) to change the electrical properties of the target in specific areas. To predict the final ion distribution we use Monte Carlo simulation, following the paths of a large number of ions as they fly through the structure. Several effects are involved, one of them being ions 'bouncing' off the crystal lattice atoms. The collision and its effects are described by mathematical models. These models are currently approximated through explicit empirical expressions. We are interested in testing how accurate these explicit formulas actually are and how replacing them with more sophisticated approaches may affect the simulation results.
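The overall shape of such a Monte Carlo simulation can be sketched with a toy model. Everything below is a placeholder assumed for illustration: the exponential free path, the uniform energy loss and the random deflection stand in for the collision models the project would actually study, and none of the names or values come from Silvaco's software.

```python
import math
import random


def simulate_ion_depths(n_ions, energy_ev, mean_free_path_nm, stop_energy_ev, seed=0):
    """Toy Monte Carlo: final depth (nm) of each ion along the beam axis.

    Each ion repeatedly flies an exponentially distributed free path, then
    collides: it loses a uniformly random fraction of its energy and leaves
    at a random angle to the beam axis. All of these rules are placeholders.
    """
    rng = random.Random(seed)
    depths = []
    for _ in range(n_ions):
        energy, depth, angle = energy_ev, 0.0, 0.0
        while energy > stop_energy_ev:
            path = rng.expovariate(1.0 / mean_free_path_nm)
            depth += path * math.cos(angle)
            energy *= 1.0 - rng.uniform(0.0, 0.5)  # placeholder energy loss
            angle = rng.uniform(-math.pi / 4, math.pi / 4)  # placeholder direction
        depths.append(depth)
    return depths


depths = simulate_ion_depths(
    n_ions=2000, energy_ev=50_000.0, mean_free_path_nm=5.0, stop_energy_ev=25.0
)
mean_depth = sum(depths) / len(depths)
```

Swapping the two placeholder lines for a physically motivated collision model, and comparing the resulting depth histograms, mirrors the kind of accuracy comparison described above.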

Your task would be to review the literature and to suggest and implement the required algorithms.

Work Environment The project assumes a high degree of independence. The development part is expected to be done in the office (in St Ives, near Cambridge).
References  
Prerequisite Skills  
Other Skills Used in the Project Statistics, Mathematical physics, Numerical Analysis, PDEs, Simulation
Acceptable Programming Languages Python, MATLAB, C++

 

Diffusion Models for Context-Aware Splice Site Prediction

Project Title Diffusion Models for Context-Aware Splice Site Prediction
Keywords Diffusion Modelling, Splice-Site Prediction, Genetics, Drug Target Discovery, Machine Learning
Project Listed 19 January 2024
Project Status Open
Contact Name Lewis Marsh
Contact Email lwmr@novonordisk.com
Company/Lab/Department Novo Nordisk, Digital Science and Innovation, Human Genetics CoE
Address The Innovation Building, Roosevelt Dr, Headington, Oxford OX3 7FZ
Project Duration 8-10 weeks, full-time
Project Open to Master's (Part III) students
Background Information Novo Nordisk is a large pharmaceutical company with a focus on identifying novel therapies for patients with cardiometabolic diseases (e.g. diabetes mellitus, obesity). Within the Human Genetics department of Novo Nordisk, the Computational and Statistical Genomics team is tasked with developing and improving statistics and ML methods that aid the discovery of new drug targets and disease mechanisms based on genomic data. Methods that can yield such insights are ML models for splice-site prediction. Our DNA serves as a blueprint for proteins and thereby for all molecular processes in the human body. When a gene, a subsequence of DNA, is transcribed into RNA and subsequently translated into a protein, certain parts of the sequence are edited out by a process called splicing. How RNA is spliced depends on a number of factors, including the DNA sequence of the gene as well as the part of the body in which the gene is being transcribed. Differences in DNA that affect splicing can be associated with complex diseases. Understanding how genetic variation alters splicing can thus give important insights into disease management and treatment.
Project Description

In this project, the student will explore how we can use diffusion models (DMs) to predict splicing and splice-sites. DMs are a generative AI method that uses insights from stochastic processes, such as Markov chains, to diffuse data (i.e. add noise to data until it follows a vanilla distribution, such as a Gaussian or uniform distribution). The models then learn how to "un-diffuse" the vanilla distribution to generate an unseen data point that follows the data distribution closely. DMs have been successfully used to make predictions about proteins [1] (such as their structure) from sequence information only. The student will build a DM, similar to the ones applied in the protein space [2], and train it on splicing data, which can be modelled in a similar way. They will then benchmark their method against state-of-the-art splice-site predictors (such as [3], which is not a DM) at making tissue-specific predictions of splice-sites. If time permits, we can also evaluate zero-shot abilities of DMs for predicting splicing by withholding certain tissue contexts during training and then evaluating predictions in unobserved tissues.

Applicants should have experience with coding in Python. Experience with the PyTorch library and with version control (e.g. through git) is a plus. Prior knowledge in biology or genetics is not a requirement.
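As a rough illustration of the forward ("diffusion") half of the process described above, the sketch below noises a nucleotide sequence with a uniform-transition Markov kernel, one of the discrete kernels discussed in [2]. The alphabet, the noise rate `beta`, and all function names are assumptions made for this example and are not part of the project specification. The learned reverse ("un-diffuse") model is not shown.

```python
import random

ALPHABET = "ACGT"  # assumed state space: the four nucleotides


def transition_matrix(beta, k=len(ALPHABET)):
    """One forward step: keep a token with probability 1 - beta,
    otherwise resample it uniformly from the k states."""
    return [[(1.0 - beta) * (i == j) + beta / k for j in range(k)]
            for i in range(k)]


def matmul(a, b):
    """Plain square-matrix product, to keep the sketch dependency-free."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def cumulative_kernel(beta, steps):
    """Product of `steps` one-step kernels: row i is the distribution of a
    token after `steps` noising steps, given that it started in state i."""
    k = len(ALPHABET)
    q = transition_matrix(beta)
    out = [[float(i == j) for j in range(k)] for i in range(k)]  # identity
    for _ in range(steps):
        out = matmul(out, q)
    return out


def diffuse(sequence, beta, steps, rng):
    """Sample a noised copy of `sequence` after `steps` forward steps."""
    q_bar = cumulative_kernel(beta, steps)
    noised = []
    for base in sequence:
        weights = q_bar[ALPHABET.index(base)]
        noised.append(rng.choices(ALPHABET, weights=weights, k=1)[0])
    return "".join(noised)


if __name__ == "__main__":
    rng = random.Random(0)
    for steps in (0, 5, 50):
        print(steps, diffuse("ACGTACGTACGT", beta=0.1, steps=steps, rng=rng))
```

As `steps` grows, every row of the cumulative kernel approaches the uniform distribution over A, C, G and T, which plays the role of the "vanilla" distribution from which a reverse model would start.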

Work Environment

The intern will have the possibility to come to the Novo Nordisk offices in London (near King's Cross) or Oxford. The intern can work remotely or in a hybrid setting (from the UK) if they wish, although we expect them to be present in person for on-boarding (in the first week) and the hand-over of the project (in the last week).

Throughout the internship the intern will attend team meetings of the Computational and Statistical Genomics team to get exposure to a broader range of research and the opportunity to interact with other scientists working on related projects.

References [1] Alamdari, Sarah, et al. "Protein generation with evolutionary diffusion: sequence is all you need." bioRxiv (2023): 2023-09.
[2] Austin, Jacob, et al. "Structured denoising diffusion models in discrete state-spaces." Advances in Neural Information Processing Systems 34 (2021): 17981-17993.
[3] Jaganathan, Kishore, et al. "Predicting splicing from primary sequence with deep learning." Cell 176.3 (2019): 535-548.
Prerequisite Skills Statistics, Probability/Markov Chains
Other Skills Used in the Project  
Acceptable Programming Languages Python

 

Mathematical models of thermal oxidation of silicon carbide: verification and calibration

Project Title Mathematical models of thermal oxidation of silicon carbide: verification and calibration
Keywords Mathematical modelling, simulation, verification, calibration, oxidation
Project Listed 19 January 2024
Project Status Filled
Contact Name Alexandros Kyrtsos
Contact Email alexandros.kyrtsos@silvaco.com
Company/Lab/Department Silvaco Europe, TCAD
Address Silvaco Technology Centre, Compass Point St Ives, Cambridgeshire, PE27 5JL
Project Duration 8 weeks
Project Open to Undergraduates, Master's (Part III) students
Background Information Process simulation in the semiconductor industry is a crucial tool for developing new technologies, as well as for maintaining existing processes. Thermal oxidation of silicon carbide is a way to produce a thin layer of oxide on the surface of a wafer in the fabrication of microelectronic structures and devices. The project aims to analyse, verify and calibrate the mathematical models of this process and to explore the effects of various modelling assumptions. The successful outcome of the project will become a part of the company's commercial software.
Project Description In 1965 Bruce Deal and Andrew Grove proposed an analytical model that satisfactorily describes the growth of an oxide layer on the plane surface of a silicon wafer [1]. This model is implemented in Silvaco's TCAD software to simulate the oxidation process in silicon carbide. The project aims to validate and calibrate the model using experimental data found in the literature. In particular, the effect of the dopant's presence in silicon carbide on the oxidation kinetics will be explored. The details of the oxidation of silicon carbide can be found in [2].
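For orientation, the Deal-Grove relation in [1] ties oxide thickness x to oxidation time t through x^2 + A x = B (t + tau), where B is the parabolic rate constant, B/A is the linear rate constant, and tau accounts for any oxide present at t = 0. A minimal Python sketch of the closed-form solution follows. The parameter values are arbitrary placeholders chosen for illustration, not calibrated values for silicon or silicon carbide, and the function name is hypothetical.

```python
import math


def oxide_thickness(t, a, b, tau=0.0):
    """Positive root of x^2 + a*x = b*(t + tau).

    t, tau : oxidation time and initial-oxide time shift (same time unit)
    a      : length-scale parameter A (same length unit as the result)
    b      : parabolic rate constant B (length^2 per unit time)
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    return 0.5 * a * (math.sqrt(1.0 + 4.0 * b * (t + tau) / (a * a)) - 1.0)


if __name__ == "__main__":
    A, B = 0.2, 0.01  # placeholder values only (e.g. micrometres, hours)
    for hours in (0.1, 1.0, 10.0, 100.0):
        print(hours, oxide_thickness(hours, A, B))
```

For short times the solution grows roughly linearly, x ≈ (B/A)(t + tau), and for long times roughly parabolically, x ≈ sqrt(B t). Calibration against experimental data would amount to fitting A, B and tau.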
Work Environment The student will work independently under the guidance of the project leader.
References [1] Deal, B. E.; A. S. Grove (December 1965). "General Relationship for the Thermal Oxidation of Silicon". Journal of Applied Physics. 36 (12): 3770-3778.
[2] https://www.iue.tuwien.ac.at/phd/simonka/index.html Thermal Oxidation and Dopant Activation of Silicon Carbide (Dissertation).
Prerequisite Skills Mathematical physics, Simulation, Predictive Modelling
Other Skills Used in the Project Simulation
Acceptable Programming Languages Python

 

Grid impedance estimation by Machine learning / AI methods

Project Title Grid impedance estimation by Machine learning / AI methods
Keywords Machine Learning, grid impedance estimation, AI, Power Electronics, Matlab / Simulink
Project Listed 24 January 2024
Project Status Open
Contact Name Rob Key
Contact Email robert.key@siemens.com
Company/Lab/Department Siemens Power Electronics Innovation Hub
Address Siemens plc, Varey Road, Congleton, Cheshire, CW12 1PH
Project Duration 8 weeks - dates flexible
Project Open to Undergraduates, Master's (Part III) students
Background Information

In industrial contexts, a multitude of power infeed errors arises from challenges associated with grid impedance. Precise estimation of grid inductances or resistances is pivotal in preventing infeed failures, reducing downtime, and optimising control processes. The traditional technique for grid impedance estimation typically entails temporarily halting industrial drives and introducing specific signals during commissioning.

However, a pioneering alternative is presented through the application of machine learning/artificial intelligence (ML/AI) techniques. This innovative approach enables continuous real-time estimation of grid impedance to within +/- 10%. The implementation of this ML/AI-based methodology not only eliminates the need for downtime during the estimation process but also empowers industrial systems to dynamically adapt to changing grid conditions.

This advanced capability not only contributes to the seamless optimisation of the entire drive system but also acts as a pre-emptive measure against potential infeed failures. For a discerning mathematics student at Cambridge University, this cutting-edge approach offers a fascinating intersection of mathematical modelling, data analytics, and practical industrial applications, showcasing the transformative potential of ML/AI in mitigating complex challenges within the realm of power systems.

Project Description

Accurate estimation of grid impedance by reduced features

Key Learning Outcomes:
 - Acquired proficiency in industrial drive systems and demonstrated expertise in working with MATLAB/Simulink simulation models.
 - Demonstrated familiarity with Python scripts developed in prior projects.

Project Outcomes: 
 - Attained a comprehensive understanding of the intricate relationship between various features and target variables.
 - Pursued accurate estimation methodologies without compromising system performance.
 - Applied ML/AI methodologies to the current Siemens drive systems, focusing on G120C or S120 models.

Work Environment Part of the PEL Innovation Team. Lab / office based team with hybrid remote and onsite working arrangements.
References K. Givaki, S. Seyedzadeh and K. Givaki, "Machine learning based impedance estimation in power system," 8th Renewable Power Generation Conference (RPG 2019), Shanghai, China, 2019, pp. 1-6, doi: 10.1049/cp.2019.0683.
Prerequisite Skills Statistics, Simulation, Data Visualization
Other Skills Used in the Project Predictive Modelling, Power Electronics
Acceptable Programming Languages No Preference

 

Reduction and efficient computation of Neural Networks

Project Title Reduction and efficient computation of Neural Networks
Keywords Neural Network, Algorithms, AI, Optimisation
Project Listed 26 January 2024
Project Status Open
Contact Name Rob Key
Contact Email Robert.key@siemens.com
Company/Lab/Department Siemens Power Electronics Innovation Hub
Address Siemens plc, Varey Road, Congleton, Cheshire, CW12 1PH
Project Duration 8 weeks, full time
Project Open to Undergraduates, Master's (Part III) students
Background Information With the ever-expanding uptake of AI, Siemens is interested in conducting a preliminary study into the background and techniques for optimising the implementation of Neural Networks and making their application more efficient.
Project Description

As a value-driven company, Siemens is constantly trying to offer the same functionality using the smallest amount of resource. Whilst trained Deep Neural Networks (DNNs) can be incredibly effective in solving many of the industrial automation / control tasks where our products are often deployed, the size and computational overhead required can become prohibitive.

We would like a student to explore DNN compression techniques which greatly decrease the number of neurons required to implement a given solution, with the aim of realising a solution on a tightly restricted processor using the packages provided by Fraunhofer at the link given in the references.

In addition to the compression, in the same way that ‘Divide and Conquer’ algorithms can drastically reduce the computational cost of the discrete Fourier transform, we would like the student to try to generalise the problem of Neural Network computation in such a way that the number of operations required to implement the inference part of the Neural Network can be reduced.
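To make the Fourier analogy above concrete, the sketch below compares a direct discrete Fourier transform, which needs on the order of n^2 operations, with a recursive radix-2 Cooley-Tukey transform, which needs on the order of n log n. It illustrates the divide-and-conquer idea only and is not a neural network method; the function names are chosen for this example.

```python
import cmath


def dft_naive(x):
    """Direct DFT: each of the n outputs sums over all n inputs."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_radix2(x):
    """Recursive Cooley-Tukey FFT for inputs whose length is a power of two.

    The problem is split into the even- and odd-indexed samples, each half
    is transformed recursively, and the halves are recombined."""
    n = len(x)
    if n == 0:
        raise ValueError("input must be non-empty")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
    slow, fast = dft_naive(signal), fft_radix2(signal)
    print(max(abs(a - b) for a, b in zip(slow, fast)))
```

Both functions return the same values up to rounding error. The saving comes from reusing the two half-size transforms for every output, which is the kind of shared structure the project would look for in network inference.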

Work Environment Lab / office based - flexible hybrid
References https://www.hhi.fraunhofer.de/en/departments/ai/research-groups/efficient-deep-learning/research-topics/neural-network-reduction-and-optimization.html
https://en.wikipedia.org/wiki/Divide-and-conquer_algorithm
https://lunalux.io/computational-complexity-of-neural-networks/
Prerequisite Skills Numerical Analysis, Mathematical Analysis, Algebra/Number Theory
Other Skills Used in the Project Simulation, Predictive Modelling
Acceptable Programming Languages No Preference

 

Pricing and hedging short term interest rate options

Project Title Pricing and hedging short term interest rate options
Keywords Interest Rates, Options, Pricing, Implied Volatility Fitting, Fixed Income, Black Scholes, Normal vs LogNormal distributions, Exchange Trading, Over-The-Counter Trading, Relative Value
Project Listed 1 February 2024; Please contact us by mid-February if you are interested
Project Status Open
Contact Name Silvia Stanescu / Yesmine Bahloul
Contact Email Silvia.stanescu@aspectcapital.com; Yesmine.bahloul@aspectcapital.com
Company/Lab/Department Aspect Capital
Address 10 Portman Square, London W1H 6AZ
Project Duration July-September full-time, we are happy to discuss any special requirements
Project Open to Undergraduates, Master's (Part III) students
Background Information

Short-term interest rate (STIR) options are derivative contracts which give the option holder (long party) the right to a payout based on the difference between the underlying rate on expiry of the option and a pre-set level for the rate (i.e. the strike price of the option). The right comes at a price, the premium for the option – which depends on the strike and the time to maturity of the option, inter alia. Equivalently, for a given strike and time to maturity, the option premium can be mapped to an (implied) volatility level, and consequently, to a volatility surface for the strike and time to maturity domains.

Short-term interest rate volatility opportunities have increased since the volatility surface shifted regime in late 2021/early 2022. For example, taking a benchmark European short-term interest rate (Euribor or, more recently, €STR), option markets were typically pricing moves of 1 to 2 basis points per day before the hiking cycle, versus 7bps/day or more recently. In a low-carry environment, fixed income investors can absorb a relatively small rise in realised volatility before hedging or reducing exposure (which, once triggered, can contribute to further hedging). The new regime makes it particularly interesting to explore the valuation of these products versus more liquid instruments.

Project Description

The expectations of the project are:
- Fitting volatility surfaces under different assumptions
- Pricing Capfloors and options on futures in a consistent normal volatility framework
- Potentially exploring the relative valuation vs swaptions

No prior knowledge of the instruments is required but a basic understanding of (or interest in) the Fixed Income market and products (futures, OTC swaps) will definitely help.

We will provide sample data: premiums and implied volatilities of capfloors and futures options, plus the required interest rate curves.

These are the different tasks for the project that we can adapt to the student/context:
- Stripping the Euribor/SOFR vol surface: unlike swaptions, where one option price = one implied vol, the Euribor/SOFR vol surface needs stripping logic applied to the liquid instruments (capfloors and futures options)
- Producing a cap/floor pricer through QuantLib
- Translating the spread to the rest of the surface into forward volatility and/or correlation of short-term interest rates. This can be used to display risk or to study relative valuations (the project can take different directions from here).
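As background to the mapping between option premium and implied volatility described above, the sketch below prices an option under the Bachelier (normal) model and inverts the price to a normal implied volatility by bisection. It assumes that the "normal volatility framework" refers to this model, works with undiscounted prices on a single forward rate, and uses made-up inputs; none of the names or numbers come from the sample data mentioned above.

```python
import math


def _norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bachelier_price(forward, strike, expiry, normal_vol, is_call=True):
    """Undiscounted Bachelier price; rates and vol in decimals, expiry in years."""
    std = normal_vol * math.sqrt(expiry)
    sign = 1.0 if is_call else -1.0
    d = sign * (forward - strike) / std
    return sign * (forward - strike) * _norm_cdf(d) + std * _norm_pdf(d)


def implied_normal_vol(price, forward, strike, expiry, is_call=True,
                       lo=1e-8, hi=1.0, tol=1e-12):
    """Bisection on the vol; valid because the price increases with vol."""
    def f(vol):
        return bachelier_price(forward, strike, expiry, vol, is_call)

    if not f(lo) <= price <= f(hi):
        raise ValueError("price is outside the range covered by [lo, hi]")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical inputs: 3.50% forward, 3.75% strike, 6 months, 110bp vol.
    premium = bachelier_price(0.035, 0.0375, 0.5, 0.011)
    print(premium, implied_normal_vol(premium, 0.035, 0.0375, 0.5))
```

Repeating the inversion over a grid of strikes and expiries is one simple way to build the points of a volatility surface. A production pricer would add discounting, day-count conventions and the caplet-by-caplet structure of a cap.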

Work Environment The student will be mentored by at least one person from the volatility team. The team includes 4 researchers and the student will be encouraged to interact with all of the team as well as other research teams if/when needed.
References Some resources to get familiar with the topic:
- Handbook of volatility models and their applications, Bauwens, Hafner, Laurent
- Pricing and Hedging Interest Rate Options: Evidence from Cap-Floor Markets – Gupta2002
- Valuing Interest Rate Caps and Floors Using QuantLib Python: by Goutham Balaraman
- Interest Rate Options Conventions from the AFMA association
- Interest Rate Caps and Floors Primer - FinPricing
Prerequisite Skills Numerical Analysis, PDEs, Mathematical Analysis, Simulation, Database Queries, Data Visualisation
Other Skills Used in the Project Probability / Markov Chains, Algebra / Number Theory, Predictive Modelling
Acceptable Programming Languages Python, MATLAB

 

Changepoint Detection in Financial Time Series

Project Title Changepoint Detection in Financial Time Series
Keywords Changepoint detection, Financial mathematics, Time series analysis
Project Listed 10 April 2024
Project Status Open
Please apply by 1 May 2024
Contact Name Chris Hunter and Que Vuong-Crouzier
Contact Email qvuong@pharo.com
Company/Lab/Department Pharo Management
Address 8 Lancelot Place, London, SW7 1DR
Project Duration 8 weeks, full time
Project Open to Undergraduates, Master's (Part III) students
Background Information Change point detection (CPD) aims to detect sudden changes in the properties of time series data, for example in the level of volatility or in the correlation between two time series. This is important for financial applications, as markets often switch between different regimes, resulting in different behaviour across the traded assets. Many existing tools are built on assumptions of stationarity and are slow to adapt, whereas CPD methods can adapt quickly when the data exhibits change.
Project Description

There are several approaches to CPD, including online vs offline and Bayesian vs Frequentist vs non-parametric. The student should first conduct a literature review of the various methodologies and then investigate one or more specific methods in depth using financial data. Some examples are:

- Online CPD methods for measuring asset volatility and correlation
- Online CPD methods using high-frequency price data
- Offline CPD methods for analysing portfolio manager returns
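As a point of reference for the offline, non-Bayesian end of the approaches listed above, the sketch below finds a single change in mean by a least-squares search over all split points. Applied to squared returns, it acts as a crude detector of a shift in volatility. The data is synthetic and the function name is hypothetical; the methods in the references are considerably more sophisticated.

```python
def best_single_changepoint(values, min_size=2):
    """Return (index, cost) such that splitting into values[:index] and
    values[index:] minimises the total within-segment squared error."""
    n = len(values)
    if n < 2 * min_size:
        raise ValueError("series too short for the requested min_size")

    # Prefix sums of x and x^2 give each segment cost in constant time.
    s1, s2 = [0.0], [0.0]
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(a, b):
        """Sum of squared deviations from the mean over values[a:b]."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best_index, best_cost = None, float("inf")
    for k in range(min_size, n - min_size + 1):
        c = cost(0, k) + cost(k, n)
        if c < best_cost:
            best_index, best_cost = k, c
    return best_index, best_cost


if __name__ == "__main__":
    # Synthetic returns: calm regime followed by a more volatile regime.
    returns = [0.01, -0.01] * 10 + [0.05, -0.05] * 10
    squared = [r * r for r in returns]
    print(best_single_changepoint(squared))
```

An online method would instead update its estimate as each new observation arrives, rather than scanning the full history.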

Work Environment The student will ideally work from Pharo's office on a full-time basis. The project will be done independently with guidance from the team.
References https://techrando.com/2019/08/14/a-brief-introduction-to-change-point-detection-using-python/
https://arxiv.org/abs/2003.06222
https://arxiv.org/abs/1906.10372
Prerequisite Skills Statistics
Other Skills Used in the Project Statistics
Acceptable Programming Languages Python