2025 CMP Industrial Projects

Below you will find the list of industrial CMP projects hosted by external companies. A separate list covers academic projects hosted by other departments and labs within the university.

New projects will be added throughout Lent Term so check back regularly!

How to Apply

Unless alternative instructions are given in the project listing, to apply for a project you should send your CV to the contact provided along with a covering email which explains why you are interested in the project and why you think you would be a good fit.

Need help preparing a CV or advice on how to write a good covering email?

The Careers Service are there to help! Their CV and applications guides are packed full of top tips and example CVs.

Looking for advice on applying for CMP projects specifically? Check out this advice from CMP Co-Founder and Cambridge Maths Alumnus James Bridgwater.

Remember: it’s better to put the work into making fewer but stronger applications tailored to a specific project than firing off a very generic application for all projects – you won’t stand out with the latter approach!

Please note that to participate in the CMP programme you must be a student in Part IB, Part II, or Part III of the Mathematical Tripos at Cambridge.

Want to know more about a project before you apply?

Come along to the CMP Lunchtime Seminar Series in February 2025 to hear the hosts give a short presentation about their project. There will be an opportunity afterwards for you to chat informally with hosts about their projects.

Alternatively (or as well!), you can reach out to the contact given in the project listing to ask questions.

Industrial CMP Project Proposals for Summer 2025

Amazon Lab126 - Enabling Large Models for Edge AI with Disentanglement and Compositionality
Keywords: Disentanglement, Compositionality, Edge AI, Model Optimization, Resource Efficiency
Vanellus Technologies Ltd - Accelerate CFD convergence with improved field initialisation and mixed precision solves
Keywords: Simulation, Software Engineering, Numerical Analysis, Fluid Dynamics, Physics
APEX Horticulture - Utilising existing cut flower performance and quality data to inform and accelerate decisions for future developments and planting decisions
Keywords: Horticulture, Predictive modelling
Silvaco TCAD - Filtering of the result of Monte Carlo simulation
Keywords: Statistics, Mathematical physics, Numerical Analysis, Monte Carlo simulation, Filtering
Amazon Lab126 - Prisoners Dilemma, LLMs as agents
Keywords: Game theory, LLM agents, Knowledge graphs, Stochastic modelling
AstraZeneca PLC - Hallmarks of cancer regression
Keywords: predictive biomarkers, multimodal data, hierarchical regression, Hallmarks of cancer
Silvaco, Process Engineering Team - Investigation of dopant activation and diffusion in SiC
Keywords: tcad, modeling, activation, diffusion, SiC
Signaloid Ltd - Discrete Representations of Continuous Probability Distributions
Keywords: Distributions, Probability, Representations, Statistics
Silvaco Europe, TCAD - Finite Difference Approximation of Multiphase Stokes Flow with Free Interfaces on Staggered Cartesian Grids
Keywords: Multiphase Stokes Flow, Finite-Difference Methods, PDE, Applied Linear Algebra
Unilever SERS - Exploring the use of Generative Adversarial Networks for synthetic data generation
Keywords: Generative adversarial networks (GANs), Synthetic data, Toxicology, Neural networks, Applied scientific computing
Pharo Management - Bucketed interest rate risk
Keywords: Financial Mathematics, Interest Rates, Risk Management
LifeArc - Machine learning on multimodal and unstructured data for healthcare applications
Keywords: AI, machine learning, healthcare, multimodal data
VET.CT - Veterinary Data Analysis
Keywords: Quantitative & Qualitative Analysis, Statistical Skills, Incomplete Data Handling, Impact Assessment, Structured Reporting

Enabling Large Models for Edge AI with Disentanglement and Compositionality

Project Title	Enabling Large Models for Edge AI with Disentanglement and Compositionality
Keywords	Disentanglement, Compositionality, Edge AI, Model Optimization, Resource Efficiency
Project Listed	8 January 2025
Project Status	Filled
Contact Name	Orange Gao
Contact Email	orangez@amazon.com
Company/Lab/Department	Amazon Lab126
Address	One Station Square, Cambridge, CB1 2GA
Project Duration	8 weeks; full-time
Project Open to	Master's (Part III) students
Background Information	Deploying large AI models on edge devices is a significant challenge due to their limited computational, memory, and energy resources. These constraints often necessitate trade-offs between model performance and efficiency, making it difficult to use cutting-edge AI technologies in applications like IoT, wearables, and real-time systems. This project explores the intersection of disentanglement and compositionality, two promising concepts in AI research, to address these challenges: Disentanglement focuses on isolating meaningful, task-specific features from complex data representations. By enhancing interpretability and generalization, disentanglement makes it possible to optimize models while preserving essential functionality. Compositionality allows models to break down tasks into smaller, reusable components that can be recombined to address a variety of tasks. This modular approach facilitates scalability and adaptability, especially in resource-constrained environments. By leveraging these principles, the project aims to make large AI models lightweight and efficient while retaining strong performance. This approach offers the potential to unlock new applications for AI on edge devices, where real-time performance, adaptability, and energy efficiency are critical.
Project Description	This project involves exploring and developing methods to enable large AI models to operate efficiently on edge devices by leveraging disentanglement and compositionality. The work is open-ended, allowing flexibility to adapt the later stages based on findings from initial experiments. The student will undertake the following key activities:
Feature Disentanglement: Implement and analyze disentanglement techniques like β-VAE, InfoGAN, and diffusion-based methods to extract meaningful task-specific features from complex datasets. Evaluate these methods using mathematical tools such as latent space analysis, information theory metrics (e.g., KL divergence), and mutual information estimation.
Model Pruning and Quantization: Use mathematical optimization methods to identify and remove redundant parameters, channels, or layers from large models. Apply quantization techniques, involving numerical precision analysis and statistical error evaluation, to compress model size and reduce computational overhead.
Knowledge Distillation: Implement teacher-student learning paradigms, leveraging intermediate representations like logits or feature maps. Use statistical and machine learning techniques to evaluate performance transfer and fidelity between the teacher and student models.
Compositional Representation Learning: Design and train modular, compositional models capable of combining smaller primitives for broader task applicability. Explore combinatorics and graph-based algorithms for representing and evaluating the modular structure of tasks.
Edge Deployment Optimization: Adapt the optimized model for edge device constraints by integrating hardware-aware design principles such as depthwise convolutions and lightweight layers. Utilize performance profiling to ensure real-time efficiency.
Successful Outcome: A successful outcome would include an optimized model capable of efficient and accurate inference on an edge device; a clear understanding of how disentanglement and compositionality enhance model generalization and scalability; quantitative performance improvements (e.g., reduced model size, lower latency) validated with metrics like computational cost, memory usage, and task accuracy; and a well-documented workflow and reproducible codebase.
How It’s Interesting/Useful: This project combines cutting-edge AI techniques with real-world application in edge computing. The outcomes can be impactful for industries like IoT, wearables, and personalized AI, where resource efficiency is critical. The modular approach ensures the work is extensible, allowing integration into diverse AI tasks.
Use of Mathematical Skills: Students will actively use mathematical skills in areas such as Optimization (minimizing loss functions and pruning redundant parameters); Probability and Statistics (understanding distributions in disentanglement and analyzing quantization impacts); Linear Algebra (matrix manipulations for model compression and feature extraction); Information Theory (metrics like entropy and mutual information for disentanglement and knowledge distillation evaluation); and Combinatorics (designing and analyzing compositional representations and modular task recombinations). By the end of the project, the student will gain experience applying theoretical mathematical concepts to practical problems in AI and edge computing, contributing to a rapidly evolving field.
Work Environment	The student will work independently on this project, with myself serving as the industrial supervisor. I will provide regular guidance and mentorship, helping the student define goals, troubleshoot challenges, and refine their approach throughout the project. Although the student will primarily work on their own, I will be readily available for discussions and feedback through scheduled meetings and as needed via email or video calls. The student will have the flexibility to work remotely, allowing them to structure their schedule to maximize productivity. There are no fixed office or lab hours, but the student is encouraged to maintain consistent progress and attend periodic check-ins to review milestones and ensure alignment with the project goals. Day-to-day, the student will engage in tasks such as implementing and testing machine learning models, analyzing results, and documenting findings. They will have access to tools, datasets, and resources necessary for the project, along with my guidance to navigate technical or conceptual challenges. This setup offers the student a hands-on, immersive experience while fostering independence and problem-solving skills.
References	[1] "beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework." ICLR 2017. [2] "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets." NeurIPS 2016. [3] Wu, Cindy, et al. "What Mechanisms Does Knowledge Distillation Distill?" Proceedings of UniReps: the First Workshop on Unifying Representations in Neural Models. PMLR, 2024. [4] Chen, H., Zhang, Y., Wang, X., Duan, X., Zhou, Y., & Zhu, W. (2023). "DisenBooth: Disentangled Parameter-Efficient Tuning for Subject-Driven Text-to-Image Generation." ICLR 2024. [5] "Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations." ICML 2019. [6] Jin, Zeng et al. "Closed-Loop Unsupervised Representation Disentanglement with β-VAE Distillation and Diffusion Probabilistic Feedback." ECCV 2024.
Prerequisite Skills	Statistics, Probability/Markov Chains, Image processing, Mathematical Analysis
Other Skills Used in the Project	Numerical Analysis, Mathematical Analysis, Simulation, Predictive Modelling
Acceptable Programming Languages	Python
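
The quantization activity described above (numerical precision analysis plus statistical error evaluation) can be illustrated with a minimal sketch. This is not the host's method: the symmetric int8 scheme, the function names `quantize_int8`, `dequantize` and `rms_error`, and the synthetic Gaussian "weights" are all assumptions made for illustration, using only the Python standard library.

```python
import random


def quantize_int8(weights):
    """Symmetric uniform quantization of floats to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude onto code 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


def rms_error(original, restored):
    """Root-mean-square difference between two equal-length lists."""
    n = len(original)
    return (sum((a - b) ** 2 for a, b in zip(original, restored)) / n) ** 0.5


# Synthetic stand-in for one layer's weights (assumption: roughly Gaussian).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.1) for _ in range(1000)]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(f"scale = {scale:.6f}, RMS error = {rms_error(weights, restored):.6f}")
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which gives a simple worst-case bound to compare against the measured RMS error. A real study would apply this per layer or per channel and track the effect on task accuracy.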

Accelerate CFD convergence with improved field initialisation and mixed precision solves

Project Title	Accelerate CFD convergence with improved field initialisation and mixed precision solves
Keywords	Simulation, Software Engineering, Numerical Analysis, Fluid Dynamics, Physics
Project Listed	8 January 2025
Project Status	Open
Contact Name	Laurence Cullen
Contact Email	laurence@vanellus.tech
Company/Lab/Department	Vanellus Technologies Ltd
Address	Unit 6, The Courtyard, Sturton Street, Cambridge, CB1 2SN
Project Duration	8 weeks; full-time
Project Open to	Master's (Part III) students
Background Information	In engineering, computational fluid dynamics (CFD) and thermodynamic simulations are an increasingly critical tool for designing performance-optimised systems. However, increasing design complexity and higher performance requirements mean more pressure is put on current simulation tools. For many applications, current simulation tools are too slow, inaccurate or hard to use for effective design optimisation. Vanellus is developing a new GPU-based multiphysics simulation and optimisation engine in order to remove current bottlenecks on simulation usage. At its core, CFD involves solving complex non-linear PDEs using numerical algorithms. Numerically solving non-linear PDEs almost always involves an iterative process, where an initial guess of the solution is gradually improved upon. A high-value research area is coming up with ways to improve the initial guess, so that it is closer to the true solution and therefore requires fewer iterations to reach convergence. The challenge here is finding the balance between the quality of the initial guess and the amount of computing resources it takes to find it.
Project Description	As a small startup, we have a range of mathematical tasks to tackle with a flexible R&D roadmap, so it’s worth noting that depending on your research interests and our technical progress, we are open to adapting the project to suit our mutual needs. What we would like you to do: Review background literature and required reading to get you up to speed on non-linear PDE solving. In discussion with the team, decide on a promising and tractable research direction to focus on during the placement. Identify required modifications to our code to test the approach. Run experiments to evaluate the efficacy. Some ideas for research directions we have found: Using statistical/ML methods to directly deduce plausible initial guesses given simulation geometries and boundary conditions. Running an initial cheap simulation in lower floating point precision, and then “upscaling” the result to high precision. Successful outcome: Improved non-linear PDE initialisation is a large open field, so we do not expect the problem to be fully solved during the time of the placement. Success for us will be if, at the project conclusion, we can have some preliminary results that either show the potential or disprove the usefulness of a particular numerical method. How would it be interesting/useful? This project will allow you to use your mathematical expertise in the context of an R&D-focused software engineering startup. In addition to improving your knowledge and skills in numerical analysis, we expect you to learn the basics of software engineering in a team, including using version control and software engineering principles such as unit testing. Our hope is you would leave this placement in a great position either to continue with academic research or to pursue a career demanding software skills. For those who are interested, we predominantly program in Python, specifically using the JAX accelerated numerical computing library. 
We also sometimes make use of lower-level languages such as CUDA and Rust.
Work Environment	You will work in the office as part of our team including the company founders. We are based in Cambridge (10 min walk from the train station). We typically do 9-5:30 working hours. We are a heavily collaborative team, so we’re sharing ideas and knowledge throughout the day. We start every day with a 15-minute meeting where we all share what we’ll be working on for that day and if we need any help. As well as your own project, we would love to get your insight on our day-to-day R&D mathematical problems at the whiteboard. We have a strong emphasis on peer learning, and all of our code goes through review by the team, where we share ideas on how to improve code quality and structure.
References	[1] Fluid Mechanics 101 YouTube Channel: https://www.youtube.com/playlist?list=PLnJ8lIgfDbkoZ33CHr-p6z2CBkp9OTcWj This is an excellent channel for learning the fundamentals of CFD from a mathematical perspective, and this playlist is a good place to start. [2] Notes on CFD from the developers of OpenFOAM: https://doc.cfd.direct/notes/cfd-general-principles/ This is an online textbook written by the developers of OpenFOAM (one of the most popular open-source CFD codes), which gives an excellent overview of some of the key algorithms behind CFD. [3] Numerical Linear Algebra by Trefethen & Bau: https://www.stat.uchicago.edu/~lekheng/courses/309/books/Trefethen-Bau.pdf This is a more in-depth textbook for learning numerical linear algebra, which should be helpful in learning the fundamentals of iterative schemes. [4] CFDNet: A deep learning-based accelerator for fluid simulations, Obiols-Sales et al. https://arxiv.org/pdf/2005.04485 This is an interesting paper where deep neural network methods were used to find improved initial guesses for Reynolds Averaged Navier Stokes (RANS) simulations. [5] On floating point precision in computational fluid dynamics using OpenFOAM, Brogi et al. https://www.sciencedirect.com/science/article/pii/S0167739X23003813 This paper experimented with using different floating point precision for reference PDE solving problems.
Prerequisite Skills	Fluids, Numerical Analysis, PDEs, Simulation
Other Skills Used in the Project	Statistics, Mathematical physics, Predictive Modelling, Data Visualization, App Building
Acceptable Programming Languages	Python, MATLAB, C++, Rust, CUDA, C
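
One of the research ideas listed above, running a cheap low-precision solve and then "upscaling" the result to high precision, can be sketched on a toy problem. Everything here is an assumption made for illustration and not Vanellus's code: a linear 1D Poisson problem stands in for a non-linear CFD solve, Gauss-Seidel stands in for the real solver, and single precision is only emulated by rounding stored values through the standard `struct` module.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def gauss_seidel(u, rhs, h, tol, max_sweeps, rnd=lambda x: x):
    """Gauss-Seidel sweeps for -u'' = rhs on a uniform grid, u = 0 at both ends.

    `rnd` is applied to every stored value, so passing `to_f32` emulates
    single-precision storage. Returns (solution, number of sweeps used).
    """
    u = [rnd(v) for v in u]
    n = len(u)
    for sweep in range(1, max_sweeps + 1):
        biggest_change = 0.0
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            new = rnd(0.5 * (left + right + h * h * rhs[i]))
            biggest_change = max(biggest_change, abs(new - u[i]))
            u[i] = new
        if biggest_change < tol:
            return u, sweep
    return u, max_sweeps


n = 31                                   # interior grid points
h = 1.0 / (n + 1)
xs = [(i + 1) * h for i in range(n)]
rhs = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
zeros = [0.0] * n

# Baseline: double precision starting from a zero field.
u_cold, cold_sweeps = gauss_seidel(zeros, rhs, h, 1e-10, 20000)

# Stage 1: emulated single-precision solve to a loose tolerance.
u_low, low_sweeps = gauss_seidel(zeros, rhs, h, 1e-6, 20000, rnd=to_f32)
# Stage 2: reuse that result as the initial guess for the double-precision solve.
u_warm, warm_sweeps = gauss_seidel(u_low, rhs, h, 1e-10, 20000)

print(f"cold start: {cold_sweeps} double-precision sweeps")
print(f"warm start: {low_sweeps} low-precision + {warm_sweeps} double-precision sweeps")
```

In this emulation the low-precision sweeps cost as much as the double-precision ones, so the sketch only shows that the expensive stage needs fewer sweeps from the better initial field. Any real saving depends on how much cheaper low-precision arithmetic is on the target hardware, which is the trade-off the project description highlights.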

Utilising existing cut flower performance and quality data to inform and accelerate decisions for future developments and planting decisions

Project Title	Utilising existing cut flower performance and quality data to inform and accelerate decisions for future developments and planting decisions
Keywords	Horticulture, Predictive modelling
Project Listed	8 January 2025
Project Status	Filled
Contact Name	Richard Boyle
Contact Email	richard.boyle@apexhorticulture.com
Company/Lab/Department	APEX Horticulture
Address	Pierson Road, The Enterprise Campus, Alconbury Weald, PE28 4YA
Project Duration	8 weeks
Project Open to	Undergraduates, Master's (Part III) students
Background Information	APEX Horticulture Ltd. is a professional research and development business, offering bespoke services for cut flowers and plants. APEX is based in a purpose-built testing centre, situated in Alconbury, Cambridgeshire (UK). APEX is part of the wider MM Flowers group, where MM is one of the UK’s leading cut flower import and processing companies, with a unique ownership model and innovative practices. MM Flowers is owned by the AM Fresh Group, a leading breeder, grower and distributor of citrus and grapes; Vegpro, East Africa’s largest flower and vegetable producer; and Elite, the leading flower grower and breeder in South America. APEX is at the optimal position in the chain, able to deliver high quality, independent research and close-to-market proximity matched with invaluable insight into the true performance of flowers and plants subjected to actual supply chain conditions. The infrastructure and specialised personnel of APEX aim to deliver robust, standardised and consistent research every week of the year, together with the ability to undertake large scale projects to match all client requirements, influencing all elements of the cut flower supply chain. APEX undertakes many different research projects covering the entire supply chain, from development of new flower types through to the manufacturing requirements for the final bouquets. Each of these projects generates a significant amount of data and insight, which is used to provide recommendations to the various stakeholders of each project.
Project Description	APEX tests up to 50k cut flower samples annually, with around 30-60 data points generated per sample. This data ranges from agronomic and freight data through to performance data associated with flower longevity and aesthetic appeal. Many of the samples tested are part of long-term programmes focussed on understanding the performance of different cultivars and farms across seasons and years. Alongside this, prospective new cultivars are tested to understand if there are alternative and ‘better’ options available to the current selection. However, the process to develop and introduce new cultivars is inefficient, taking anywhere up to 10 years. It is heavily reliant on the intuition of breeders, and it can be a challenge to successfully introduce new cultivars that meet rapidly changing requirements. For example, whilst many of the cut flowers grown on the equator are transported to Europe by air freight, the entire industry is currently evaluating the possibility of transitioning much of this to sea freight. Whilst this presents many benefits, including for the environment and for availability, it substantially increases the freight time, which many existing flower types and cultivars are not able to withstand. During the development process, the breeders and growers are presented with a dilemma: there is a desire to be informed and led by data (such as from APEX), but this is a slow process due to the limited numbers of samples available initially. Accelerated data generation would require significantly more plants and thus samples, which requires various resources (time, space and inputs) and carries greater risk if the cultivars prove to be unviable. However, an abundance of data is available from cases where new cultivars have been introduced, with varying levels of success. This has obvious implications for the breeder/grower, but also for those along the supply chain, including suppliers and retailers. Flower types and cultivars selected that do not meet the required standards can result in significant waste, consumer dissatisfaction and potentially brand damage. As such, there is a desire to improve the efficiency and speed of the flower development process whilst minimising, or at least understanding, the associated risks. Given the above, there are different areas that APEX, MM Flowers and the wider group would like to explore, including: Can the available historical data be utilised to create models to predict the likelihood of success of cultivars currently being developed as part of breeding programmes? Where data is available for new flower types and cultivars that have been transported by sea or subjected to sea freight simulations, what impact does that have on determining their viability for successful commercial application?
Work Environment	Student led project, supported by wider team. Hybrid working.

Filtering of the result of Monte Carlo simulation

Project Title	Filtering of the result of Monte Carlo simulation
Keywords	Statistics, Mathematical physics, Numerical Analysis, Monte Carlo simulation, Filtering
Project Listed	15 January 2025
Project Status	Open
Contact Name	Artem Babayan
Contact Email	artem.babayan@silvaco.com
Company/Lab/Department	Silvaco TCAD
Address	Silvaco Europe Ltd., 5 Compass Point, St Ives, PE27 5JL
Project Duration	8-12 weeks full time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Mathematical modelling of real-life physical problems
Project Description	Silvaco is a software engineering company that develops tools to assist in the manufacture of semiconductor devices. In the UK office we mostly work on the 'process simulation' side: mathematical modelling of the processes used in manufacturing. One such process is implantation: the bombardment of a piece of (typically) Si with ions (dopants) to change the electrical properties of the target in specific areas. To predict the final ion distribution we use Monte Carlo simulation, following the paths of a large number of ions as they fly through the structure. The final results show artefacts typical of Monte Carlo simulation, e.g. single stray particles or missed areas ('hot' and 'cold' spots respectively). We need to apply a filter to these 'raw' results to improve the overall quality. Your task would be to review the literature and to suggest and implement the required algorithms.
Work Environment	The project assumes a high degree of independence. The development part is expected to be done in the office (in St Ives, near Cambridge).
References
Prerequisite Skills
Other Skills Used in the Project	Statistics, Mathematical physics, Numerical Analysis, Simulation
Acceptable Programming Languages	Python, MATLAB, C++
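
To make the 'hot' and 'cold' spot artefacts described above concrete, here is a minimal sketch of one candidate filter, a 3x3 median filter on a 2D grid. The choice of a median filter, the function name `median_filter_2d`, and the synthetic dose map are assumptions made for illustration. The listing does not say which filters Silvaco uses or prefers, and choosing one is part of the project.

```python
from statistics import median


def median_filter_2d(grid, radius=1):
    """Replace each cell by the median of its neighbourhood.

    Near the edges the window is truncated to the cells that exist.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = median(window)
    return out


# Synthetic stand-in for a raw result: concentration falls off with depth (row).
raw = [[100.0 / (1 + r) for _ in range(6)] for r in range(6)]
raw[2][3] = 900.0  # 'hot' spot: a single stray particle
raw[4][1] = 0.0    # 'cold' spot: a missed cell

filtered = median_filter_2d(raw)
print("hot spot :", raw[2][3], "->", round(filtered[2][3], 2))
print("cold spot:", raw[4][1], "->", round(filtered[4][1], 2))
```

A median filter removes isolated outliers without averaging them into their neighbours, but it also alters legitimate values, especially where the window is truncated at the edges. Weighing that kind of side effect against the quality improvement is the sort of question the literature review would need to address.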

Prisoners Dilemma, LLMs as agents

Project Title	Prisoners Dilemma, LLMs as agents
Keywords	Game theory, LLM agents, Knowledge graphs, Stochastic modelling
Project Listed	15 January 2025
Project Status	Open
Contact Name	Uday Kiran
Contact Email	kirannu@amazon.com
Company/Lab/Department	Amazon Lab126
Address	One Station Square, Cambridge, CB1 2GA
Project Duration	8 weeks; full-time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	The prisoner's dilemma (PD) is a game theory paradox that illustrates how two rational individuals acting in their own self-interest can produce a suboptimal outcome for the group. It's a thought experiment where each individual can choose to cooperate with their partner for mutual benefit or betray them for personal gain. The dilemma arises because while it's rational for each individual to defect, cooperation would result in a higher payoff for both. This project tries to model rational individuals with pre-biased LLM agents to make the problem more representative of the real world.
Project Description	The multiplayer prisoner's dilemma: The multiplayer prisoner's dilemma, also known as the n-person prisoner's dilemma (NPD), is a game theory scenario where multiple players must choose between cooperating and defecting. Cooperation: players work together for the common good. Defection: players pursue their own short-term interests. The outcome for each player depends on their choice and the choices of all other players. If everyone chooses to defect, the outcome is worse for everyone than if they had cooperated. The NPD became popular in the 1970s among economists and social theorists. It can be used to model many real-world social, political, and economic problems. For example, the tragedy of the commons is a multiplayer generalization of the prisoner's dilemma. In this scenario, villagers must choose between personal gain and restraint. If they all choose to defect, the commons are destroyed.
PD problem for LLMs and knowledge graphs: This section outlines an approach to redesigning the Prisoner's Dilemma (PD) problem using LLM agents with personalized characteristics. Here's a high-level outline of how you could implement this simulation:
1. LLM-based Agents: Utilize pre-trained large language models (e.g., GPT-3, BERT) as the foundation for each simulated individual. Fine-tune or condition the LLMs with specific personality traits, using available metadata and knowledge graphs to capture individual characteristics.
2. Knowledge Graph Representation: Construct a knowledge graph that encodes the metadata and relationships for each simulated individual, such as purchase history, search behavior, demographic information, and other relevant attributes. Leverage the knowledge graph to inform the decision-making and behavior of the LLM agents during the Prisoner's Dilemma game.
3. Personality Trait Assignment: Assign generic personality traits (e.g., greedy, mischievous, cooperative, humble, merciful) to the LLM agents based on the information in the knowledge graph. Ensure that the personality traits are reflected in the agents' decision-making and interactions during the Prisoner's Dilemma game.
4. Prisoner's Dilemma Simulation: Create a simulated group of 1,000 to 10,000 LLM agents and have them participate in the Prisoner's Dilemma game. Implement the game mechanics, where each agent must choose between cooperating or defecting, based on their personalized characteristics and the game's payoff structure.
5. Multi-group Simulation: Extend the simulation to include multiple groups of LLM agents, representing different demographics or nations. Facilitate both intra-group and inter-group Prisoner's Dilemma interactions, allowing for the exploration of social behavior responses across different demographics or nations.
Key Considerations: Ensure the LLM agents' decision-making and behavior are realistic and aligned with the assigned personality traits and knowledge graph information. Explore techniques to fine-tune or condition the LLMs to exhibit the desired personality traits and decision-making processes. Carefully design the knowledge graph structure and the mapping between metadata and personality traits to achieve accurate individual and group-level behavior. Implement robust simulation mechanics and data collection to analyze the emergent behavior and outcomes of the Prisoner's Dilemma game across the different groups and scenarios.
Problems to solve for successful completion of the project: Competitive pricing in the marketplace:
1. Modeling the Marketplace Dynamics: Use multi-agent systems to simulate the interactions between different sellers, each represented by an AI agent. Leverage knowledge graphs to capture the relationships between sellers, their products, pricing, branding, and marketing strategies. Employ stochastic modeling techniques, such as Markov Decision Processes, to capture the uncertainty and dynamic nature of the marketplace.
2. Modeling Seller Behavior: Develop AI agents that can learn and adapt their pricing, branding, and marketing strategies based on the actions of their competitors and the responses of buyers. Utilize reinforcement learning or other machine learning techniques to allow the agents to learn and optimize their strategies over time. Incorporate game-theoretic principles to model the strategic decision-making of the sellers, taking into account the potential responses of their competitors. Incorporate human personality traits (e.g. greedy, mischievous, cooperative, humble, merciful) in the knowledge graphs.
3. Modeling Buyer Behavior: Integrate buyer preferences, price sensitivity, and decision-making processes into the model. Leverage techniques like discrete choice modeling or agent-based modeling to capture the heterogeneity and complexity of buyer behavior. Explore how buyer actions and preferences influence the pricing and marketing decisions of the sellers. Incorporate human personality traits (e.g. greedy, mischievous, cooperative, humble, merciful) in the knowledge graphs.
4. Incorporating the Tragedy of the Commons: Explore how the shared nature of the marketplace, similar to the tragedy of the commons, can lead to suboptimal outcomes for the sellers and buyers. Investigate strategies or mechanisms that can mitigate the tragedy of the commons, such as coordination, cooperation, or regulation. Analyze how the presence of the tragedy of the commons affects the pricing, branding, and marketing decisions of the sellers.
5. Leveraging Multi-LLM Agents: Utilize multiple large language models (LLMs) to represent different aspects of the marketplace, such as seller decision-making, buyer behavior, and market dynamics. Explore techniques for integrating and coordinating the different LLM agents to create a cohesive and realistic simulation of the marketplace. Investigate how the combination of knowledge graphs, stochastic modeling, and multi-LLM agents can provide a more comprehensive and accurate representation of the competitive pricing landscape.
By incorporating these advanced techniques, students can gain a deeper understanding of the complex dynamics and decision-making processes involved in competitive pricing within a marketplace. This exercise can help them develop skills in multi-agent modeling, stochastic optimization, and the application of knowledge graphs and LLMs to complex real-world problems.
Work Environment	The student will work independently on this project, with myself serving as the industrial supervisor. I will provide regular guidance and mentorship, helping the student define goals, troubleshoot challenges, and refine their approach throughout the project. Although the student will primarily work on their own, I will be readily available for discussions and feedback through scheduled meetings and as needed via email or video calls. The student will have the flexibility to work remotely, allowing them to structure their schedule to maximize productivity. There are no fixed office or lab hours, but the student is encouraged to maintain consistent progress and attend periodic check-ins to review milestones and ensure alignment with the project goals. Day-to-day, the student will engage in tasks such as implementing and testing models, analyzing results, and documenting findings. They will have access to tools, datasets, and resources necessary for the project, along with my guidance to navigate technical or conceptual challenges. This setup offers the student a hands-on, immersive experience while fostering independence and problem-solving skills.
References
Prerequisite Skills	Statistics, Probability/Markov Chains, Mathematical Analysis
Other Skills Used in the Project	Predictive Modelling, Database Queries, Data Visualization
Acceptable Programming Languages	Python, R

Hallmarks of cancer regression

Project Title	Hallmarks of cancer regression
Keywords	predictive biomarkers, multimodal data, hierarchical regression, Hallmarks of cancer
Project Listed	15 January 2025
Project Status	Filled
Contact Name	Fabio Rigat
Contact Email	fabio.rigat@astrazeneca.com
Company/Lab/Department	AstraZeneca PLC
Address	36 Hills Road, Cambridge, CB2 8PA
Project Duration	8 weeks full time, ideally between 2 June and 31 July
Project Open to	Undergraduates, Master's (Part III) students
Background Information	In oncology, molecular features of prognostic or predictive value are key to matching patients with effective investigational treatment strategies. These features range from a small number of well understood markers, including expression of drug targets and molecular characterisations of the tumour microenvironment, up to high dimensional multi-modal data including genetic variants, gene expression and protein expression. When high dimensional molecular disease features are used, it is challenging to derive robust features providing accurate prediction of response to therapy in validation samples.
Project Description	This project will focus on assessing a novel supervised learning methodology that estimates low dimensional predictive markers by combining high dimensional disease molecular characteristics with established gene annotation systems based on the Hallmarks of Cancer. This assessment will include running computer simulations estimating true and false positive outcome probabilities under selected scenarios, application of the method to re-analysis of published datasets, and application of the method to exploratory analyses of internal unpublished data. The outcome of this project will be integrated with ongoing work to provide material towards a publication.
Work Environment	The student will work within the AstraZeneca Biometrics environment, supported specifically by members of the Statistical Innovations organisation.
References	1. Douglas Hanahan and Robert A. Weinberg (2011) Hallmarks of Cancer: The Next Generation, Cell, DOI 10.1016/j.cell.2011.02.013 2. Ádám Nagy, Gyöngyi Munkácsy, Balázs Győrffy (2021) Pancancer survival analysis of cancer hallmark genes, https://doi.org/10.1038/s41598-021-84787-5 3. Otília Menyhart, William Jayasekara Kothalawala, Balázs Győrffy (2024) A gene set enrichment analysis for the cancer hallmarks, https://doi.org/10.1016/j.jpha.2024.101065 4. Francesco C Stingo, Yian A Chen, Mahlet G Tadesse, Marina Vannucci (2011) Incorporating biological information into linear models: a Bayesian approach to the selection of pathways and genes, https://doi.org/10.1214/11-AOAS463
Prerequisite Skills	Statistics, Simulation, Predictive Modelling, Data Visualization, Effective collaboration skills
Other Skills Used in the Project	Interest in oncology
Acceptable Programming Languages	Python, MATLAB, R
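
To give applicants a feel for the simulation component mentioned in the description, here is a minimal sketch of a Monte Carlo estimate of true and false positive probabilities for a gene-set score. It is not the methodology under assessment: the gene set, the averaging score, the Fisher z-test, and all parameter values are illustrative assumptions.

```python
import numpy as np
from statistics import NormalDist


def gene_set_score(x, gene_set):
    """Average expression over the genes in one annotated set (one score per patient)."""
    return x[:, gene_set].mean(axis=1)


def correlation_p_value(score, outcome):
    """Two-sided p-value for zero Pearson correlation, using the Fisher z-transform."""
    n = len(score)
    r = np.corrcoef(score, outcome)[0, 1]
    z = np.arctanh(r) * np.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def rejection_rate(effect, n_patients=100, n_genes=200, set_size=10,
                   n_sims=500, alpha=0.05, seed=0):
    """Fraction of simulated studies in which the score is declared predictive."""
    rng = np.random.default_rng(seed)
    gene_set = np.arange(set_size)  # hypothetical hallmark gene set: the first genes
    rejections = 0
    for _ in range(n_sims):
        x = rng.standard_normal((n_patients, n_genes))  # simulated expression data
        score = gene_set_score(x, gene_set)
        outcome = effect * score + rng.standard_normal(n_patients)
        rejections += correlation_p_value(score, outcome) < alpha
    return rejections / n_sims


# effect = 0 gives the false positive probability, effect > 0 the true positive one.
false_positive_rate = rejection_rate(effect=0.0)
true_positive_rate = rejection_rate(effect=2.0, seed=1)
print(false_positive_rate, true_positive_rate)
```

Under the null scenario the estimate should sit close to the nominal level `alpha`; changing `effect`, `n_patients`, or `set_size` defines other scenarios.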

Investigation of dopant activation and diffusion in SiC

Project Title	Investigation of dopant activation and diffusion in SiC
Keywords	tcad, modeling, activation, diffusion, SiC
Project Listed	20 January 2025
Project Status	Open
Contact Name	Alexandros Kyrtsos
Contact Email	alexandros.kyrtsos@silvaco.com
Company/Lab/Department	Silvaco, Process Engineering Team
Address	Silvaco Europe Ltd. 5, Compass Point, St Ives, PE27 5JL
Project Duration	8 weeks, full time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Silvaco is a global leader in electronic design automation (EDA) software and technology computer-aided design (TCAD) solutions. Our cutting-edge tools empower semiconductor companies to design, simulate, and optimize next-generation devices and processes.
Project Description	As a TCAD intern focusing on process simulation, you’ll work alongside experts to investigate dopant activation and diffusion in SiC-4H, developing and enhancing models for these critical semiconductor processes. This is an opportunity to gain hands-on experience, contribute to advanced research, and be part of the innovation that drives the future of semiconductor technology. The project involves a literature review of the activation and diffusion of various dopants in SiC-4H, followed by the development and validation of models to simulate these processes. The student will have the opportunity to develop skills such as data analysis and visualization, development of physical models, simulation techniques, and programming.
Work Environment	Hybrid (mix of on-site and remote work). High degree of independent work is required.
References	https://www.iue.tuwien.ac.at/phd/simonka/index.html, chapter 3
Prerequisite Skills	Mathematical physics, Simulation, Data Visualization
Other Skills Used in the Project	Simulation, Predictive Modelling, Data Visualization
Acceptable Programming Languages	Python, MATLAB, C++

Discrete Representations of Continuous Probability Distributions

Project Title	Discrete Representations of Continuous Probability Distributions
Keywords	Distributions, Probability, Representations, Statistics
Project Listed	20 January 2025
Project Status	Open
Contact Name	Laurence Weir
Contact Email	careers@signaloid.com
Company/Lab/Department	Signaloid Ltd
Address	4 Station Square, Cambridge, CB1 2GE
Project Duration	8 weeks, full time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Probability distributions provide a mathematical framework for understanding and modelling uncertainty, allowing us to quantify the likelihood of different outcomes in random processes. By characterising how data is distributed, they enable informed decision-making and are foundational to fields like statistics, machine learning, and risk assessment. Many of these distributions, such as the famous normal distribution (bell curve), are defined continuously, but in reality we need to represent these distributions with a finite number of discrete points so that we may perform statistical tasks quickly and efficiently on a computer.
Project Description	In this project you will be working on new discrete representations of probability distributions to try and uncover better ways to capture the shape and form of many theoretical and real world distributions. First you will learn about distributions as a rigorous mathematical object and how you can perform arithmetic on them. You will also learn how we quantify the "closeness" of distributions using distance metrics and criteria. Then after researching existing methods to represent distributions discretely, you will get to try and conceive of new and improved methods. Finally, you will test and verify these methods both analytically and numerically through simulations (in Python or a similar language).
Work Environment	Join a remote team of industry mathematicians discussing probability theory and real world statistical problems. You will have the chance to talk with your supervisor multiple times per week and have them guide you through the project and oversee your progress.
References	https://link.springer.com/article/10.1007/s00362-022-01356-2
Prerequisite Skills	Statistics, Probability/Markov Chains, Simulation
Other Skills Used in the Project	Mathematical Analysis, Data Visualization, Metric Spaces
Acceptable Programming Languages	Python
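
As an illustration of the kind of object this project studies, the following minimal sketch represents a normal distribution by equally weighted points placed at its mid-quantiles and measures closeness with a 1-Wasserstein distance. The choice of mid-quantile points and of this particular distance are assumptions for the example, not the host's method.

```python
from statistics import NormalDist


def quantile_points(dist, n):
    """Return n equally weighted support points at the mid-quantiles of dist."""
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


def discrete_moments(points):
    """Mean and variance of an equally weighted discrete distribution."""
    n = len(points)
    mean = sum(points) / n
    var = sum((x - mean) ** 2 for x in points) / n
    return mean, var


def wasserstein1_equal_weights(xs, ys):
    """1-Wasserstein distance between two equally weighted point sets of equal size.

    In one dimension the optimal transport plan matches sorted points pairwise.
    """
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


target = NormalDist(mu=0.0, sigma=1.0)
coarse = quantile_points(target, 8)
fine = quantile_points(target, 64)

# The discrete variance undershoots 1 because the tails are cut off,
# and the gap shrinks as more points are used.
print(discrete_moments(coarse))
print(discrete_moments(fine))

# Shifting the distribution by 0.5 shifts every support point by 0.5.
shifted = quantile_points(NormalDist(mu=0.5, sigma=1.0), 64)
print(wasserstein1_equal_weights(fine, shifted))
```

The systematic loss of variance in the tails is one reason to look for representations that place points or weights differently.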

Finite Difference Approximation of Multiphase Stokes Flow with Free Interfaces on Staggered Cartesian Grids

Project Title	Finite Difference Approximation of Multiphase Stokes Flow with Free Interfaces on Staggered Cartesian Grids
Keywords	Multiphase Stokes Flow, Finite-Difference Methods, PDE, Applied Linear Algebra
Project Listed	24 January 2025
Project Status	Open
Contact Name	Vasily Suvorov
Contact Email	vasily.suvorov@silvaco.com
Company/Lab/Department	Silvaco Europe, TCAD
Address	Silvaco Technology Centre Compass Point St Ives, Cambridgeshire, United Kingdom PE27 5JL
Project Duration	8 weeks, 40 hours/week
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Modern semiconductor technology involves processes where materials with free interfaces undergo large and slow deformations. Such deformations can often be modelled as incompressible Stokes flow. The project aims to analyse the company’s working numerical approach to modelling such flow, with the aim of improving accuracy, stability and convergence.
Project Description	Silvaco uses finite-difference schemes on structured 2D and 3D Cartesian grids to simulate multiphase Stokes flow with free interfaces. A particular challenge in applying such schemes lies in the accurate approximation of boundary conditions at the interfaces between two different viscous liquids and in the approximation of momentum equations across these interfaces. The student will assist in analyzing and improving the approximation and stability of the current numerical schemes, with the possibility of proposing better alternatives. Special attention will be given to developing numerical schemes that are well-suited for iterative methods such as BiCGSTAB. The resulting matrices will be analysed using SVD, QR factorization, and other appropriate techniques from Numerical Linear Algebra.
Work Environment	The student will work on their own with support and guidance from the supervisor.
References
Prerequisite Skills	Numerical Analysis, PDEs, Algebra/Number Theory
Other Skills Used in the Project
Acceptable Programming Languages	Python, MATLAB
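
The interface difficulty described above can be seen in a much simpler setting than the full problem. The sketch below is a one-dimensional reduction (steady shear flow of two layered fluids, where the momentum equation becomes d/dy(mu du/dy) = 0), with velocities at nodes and viscosities at the faces between them. It is an illustrative assumption, not Silvaco's scheme; the grid size, viscosities, and interface position are made up.

```python
import numpy as np


def face_viscosity(y_left, h, y_int, mu1, mu2, mode):
    """Effective viscosity on the face spanning [y_left, y_left + h]."""
    theta = np.clip((y_int - y_left) / h, 0.0, 1.0)  # fraction occupied by fluid 1
    if mode == "harmonic":
        return 1.0 / (theta / mu1 + (1.0 - theta) / mu2)
    return theta * mu1 + (1.0 - theta) * mu2  # arithmetic average


def solve_layered_shear(n_cells, y_int, mu1, mu2, mode):
    """Solve d/dy(mu du/dy) = 0 on [0, 1] with u(0) = 0 and u(1) = 1."""
    h = 1.0 / n_cells
    y = np.linspace(0.0, 1.0, n_cells + 1)
    mu_f = np.array([face_viscosity(y[i], h, y_int, mu1, mu2, mode)
                     for i in range(n_cells)])
    n = n_cells - 1  # interior unknowns u_1 .. u_{N-1}
    a = np.zeros((n, n))
    b = np.zeros(n)
    for k in range(n):
        i = k + 1  # node index; faces i-1 and i sit below and above it
        a[k, k] = mu_f[i - 1] + mu_f[i]
        if k > 0:
            a[k, k - 1] = -mu_f[i - 1]
        if k < n - 1:
            a[k, k + 1] = -mu_f[i]
    b[-1] = mu_f[-1] * 1.0  # contribution of the boundary value u(1) = 1
    u = np.concatenate(([0.0], np.linalg.solve(a, b), [1.0]))
    return y, u, a


def exact_solution(y, y_int, mu1, mu2):
    """Piecewise linear profile with continuous velocity and shear stress."""
    q = 1.0 / (y_int / mu1 + (1.0 - y_int) / mu2)  # constant shear stress
    return np.where(y <= y_int, q * y / mu1,
                    q * (y_int / mu1 + (y - y_int) / mu2))


y_int, mu1, mu2 = 0.37, 1.0, 100.0
y, u_h, a_h = solve_layered_shear(20, y_int, mu1, mu2, "harmonic")
_, u_a, _ = solve_layered_shear(20, y_int, mu1, mu2, "arithmetic")
err_harmonic = np.max(np.abs(u_h - exact_solution(y, y_int, mu1, mu2)))
err_arithmetic = np.max(np.abs(u_a - exact_solution(y, y_int, mu1, mu2)))

# Condition number from the singular values of the system matrix.
s = np.linalg.svd(a_h, compute_uv=False)
print(err_harmonic, err_arithmetic, s[0] / s[-1])
```

In this reduced problem the harmonic average reproduces the exact nodal values, while the arithmetic average misplaces the stress jump. The condition number grows with the viscosity contrast, which is what makes the choice of discretisation matter for iterative solvers.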

Exploring the use of Generative Adversarial Networks for synthetic data generation

Project Title	Exploring the use of Generative Adversarial Networks for synthetic data generation
Keywords	Generative adversarial networks (GANs), Synthetic data, Toxicology, Neural networks, Applied scientific computing
Project Listed	27 January 2025
Project Status	Filled
Contact Name	Patrik Engi
Contact Email	patrik.engi@unilever.com
Company/Lab/Department	Unilever SERS
Address	Colworth Science Park, Sharnbrook, Bedford MK44 1LQ
Project Duration	8-12 weeks
Project Open to	Undergraduates, Master's (Part III) students
Background Information	In fast-moving consumer goods, it is vital that safety risk assessments are conducted to ensure products are safe for humans and the environment. Historically, these risk assessments have relied on the use of in vivo animal testing to identify detrimental impacts of chemicals on organisms. However, from a scientific, ethical, and legislative perspective, more recently developed non-animal methods are the preferred approach. For more than 20 years, Unilever’s Safety, Environmental and Regulatory Science (SERS) has been developing novel in silico and in vitro based methods, which leverage recent advances in biology, genetics, computing, mathematics and statistics, to conduct safety assessments without the use of animal testing [1, 2]. This evolution in the risk assessment paradigm presents new opportunities for applying deep learning and AI-based approaches. A key risk assessment step is to characterise the potential effects that a chemical may have on different cell types. This typically involves using high throughput transcriptomics (HTTr) to measure the genetic response of cells to different concentrations of a test chemical. Such data can be expensive to generate, particularly if it needs to be generated for multiple chemicals and cell types. Furthermore, it is common to encounter situations where not all the necessary data for a risk assessment is readily available. Therefore, the use of approaches that maximise the utility of the available data is a high priority. Recent advances in deep learning and AI may provide a way to generate so-called synthetic data. These could be used either to fill data gaps within a risk assessment or to make predictions on the effects a chemical might cause at a gene transcriptional level. This project would focus on exploring the utility of Generative Adversarial Networks in this application.
Project Description	GANs generate synthetic data through two competing neural networks, a generator and a discriminator, engaging in a zero-sum game. This project will therefore provide the opportunity to apply and expand existing knowledge in various areas, such as statistics, probabilistic machine learning, and game theory, while also building skills and experience in applied scientific computing. We suggest that the student(s) approach the topic as an open-ended research project, focusing on recent developments using GANs in in vitro toxicology [3].
We would like the student to demonstrate and develop from the existing science by: reviewing the current literature landscape, highlighting relevant papers, tools, and resources; developing their technical knowledge of deep learning, statistics and scientific computing; and identifying, implementing and testing various tools that already exist in this space. This phase of the project would involve the student familiarizing themselves with the current state of the art regarding the application of GANs in toxicology, guided by the available literature as in Refs. [3-5].
Once this is achieved, we would like the student to advance this field of application by: generating synthetic data beyond the examples found in the literature; and identifying suitable tools and making the necessary adaptations to generate data for in-house studies.
Throughout the project, the student will have the opportunity to meet with experts from various mathematical backgrounds, as well as collaborate with other disciplines such as toxicology, human biology, and risk assessment.
Work Environment	The student will mostly work independently, but will have the full support of a wider team and supervisors for questions, guidance and advice. We expect the student will be working remotely, but visits to the site (Sharnbrook, near Bedford) are encouraged if travel permits.
References	[1] J. Reynolds, S. Malcomber and A. White, “A Bayesian approach for inferring global points of departure from transcriptomics data,” Computational Toxicology, vol. 16, p. 100138, November 2020. [2] T. E. Moxon, H. Li, M.-Y. Lee, P. Piechota, B. Nicol, J. Pickles, R. Pendlington, I. Sorrell and M. T. Baltazar, “Application of physiologically based kinetic (PBK) modelling in the next generation risk assessment of dermally applied consumer products,” Toxicology in Vitro, vol. 63, p. 104746, March 2020. [3] Chen X, Roberts R, Tong W, Liu Z. Tox-GAN: An Artificial Intelligence Approach Alternative to Animal Studies-A Case Study With Toxicogenomics. Toxicol Sci. 2022 Mar 28;186(2):242-259. doi: 10.1093/toxsci/kfab157. PMID: 34971401. [4] Chen, X., Roberts, R., Liu, Z. et al. A generative adversarial network model alternative to animal studies for clinical pathology assessment. Nat Commun 14, 7141 (2023). https://doi.org/10.1038/s41467-023-42933-9 [5] Lee, M. Recent Advances in Generative Adversarial Networks for Gene Expression Data: A Comprehensive Review. Mathematics 2023, 11, 3055. https://doi.org/10.3390/math11143055
Prerequisite Skills	Statistics
Other Skills Used in the Project	Predictive Modelling, Data Visualization, Deep learning
Acceptable Programming Languages	No Preference

Bucketed interest rate risk

Project Title	Bucketed interest rate risk
Keywords	Financial Mathematics, Interest Rates, Risk Management
Project Listed	30 January 2025
Project Status	Closed
Contact Name	Chris Hunter, Jennifer Shaeffer
Contact Email	jshaeffer@pharo.com
Company/Lab/Department	Pharo Management
Address	8 Lancelot Place, London, SW7 1DR
Project Duration	8 weeks, full time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	Pharo Management is a leading global macro hedge fund manager with a focus on emerging markets. Founded in 2000, the firm has offices in London, New York and Hong Kong and currently manages approximately $7 billion in assets across four funds. Pharo trades foreign exchange, sovereign and corporate credit, local market interest rates, commodities, and their derivatives. We trade in over 70 countries across Asia, Central and Eastern Europe, the Middle East and Africa, Latin America as well as developed markets. Our investment approach combines macroeconomic fundamental research and quantitative analysis. Pharo employs a diverse, dynamic team of over 125 professionals representing over 20 nationalities and 30 languages. We have a strong corporate culture anchored in core values such as collaborative spirit, creativity, and respect. We are passionate about what we do and are committed to attracting the best and brightest talent.
Project Description	Expected Outcomes: By the end of the internship, the intern will have developed a clear understanding of risk transformations in interest rate modeling, implemented practical computation methods for Jacobian-based transformations, and potentially explored advanced techniques using algorithmic differentiation. The project will contribute to more efficient and accurate risk management methodologies in fixed-income markets.
Project Overview: This internship project focuses on the transformation of bucketed interest rate risk using Jacobian matrices and, if time permits, the computation of bucketed risk using algorithmic differentiation (AD). The goal is to enhance methodologies for understanding and managing interest rate sensitivities in financial models. Interest rate risk is commonly analyzed by measuring sensitivities to shifts in specific maturity buckets (e.g., 1Y, 5Y, 10Y). However, for risk aggregation, stress testing, and hedging, these bucketed sensitivities must often be transformed into different risk bases, such as principal component decompositions or forward-rate perturbations (for example, 1Y1Y, 2Y3Y and 5Y5Y). This transformation is mathematically represented as a Jacobian matrix operation, which maps one set of risk factors to another while preserving sensitivity structure. Algorithmic Differentiation (AD) is a computational technique used to efficiently compute derivatives of functions expressed as computer programs. Unlike symbolic differentiation, which can lead to expression swell, or numerical differentiation, which suffers from truncation and rounding errors, AD systematically applies the chain rule of differentiation at the elementary operation level, allowing for highly accurate and efficient gradient computations. AD is particularly useful in financial applications such as risk management and derivatives pricing, where sensitivity analysis and risk calculations must be performed quickly and accurately.
Key Objectives:
1. Jacobian Risk Transformations: Understand and implement transformations between different risk factor bases. Construct and validate the Jacobian matrix that links different interest rate risk representations. Analyze stability and efficiency of transformations in practical applications.
2. Algorithmic Differentiation for Risk Computation: If time permits, explore the use of algorithmic differentiation (AD) to compute bucketed interest rate risk. Compare AD-based risk computation with traditional finite difference methods in terms of accuracy and performance.
Skills & Technologies: Mathematics & Finance: Linear algebra, calculus (Jacobian matrices), financial risk modeling. Programming: Python (NumPy, SciPy), potential exposure to AD tools such as JAX or TAPENADE. Computational Techniques: Matrix transformations, differentiation techniques, numerical stability considerations.
Work Environment	Ideally the student will work at the Pharo office (central London), supported by myself and other members of the Quantitative Analytics team. Remote working would be considered.
References	For an introduction to risk transformations using Jacobian matrices, refer to: Darbyshire, J. (2017) – Pricing and Trading Interest Rate Derivatives. For an introduction to algorithmic differentiation, refer to: Burgess, N. – Algorithmic Adjoint Differentiation (AAD) for Swap Pricing and DV01 Risk.
Prerequisite Skills	Python, Linear Algebra
Acceptable Programming Languages	Python
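
The Jacobian transformation described in the project overview can be sketched on a toy example. The code below assumes continuously compounded zero rates at four pillar maturities, a hypothetical set of fixed cash flows, and forward rates defined between consecutive pillars. All numbers are made up for illustration, and the finite-difference check stands in for the AD comparison mentioned in the objectives.

```python
import numpy as np

T = np.array([1.0, 2.0, 5.0, 10.0])               # pillar maturities in years
cash = np.array([100.0, 100.0, 100.0, 1100.0])    # hypothetical cash flows
z0 = np.array([0.030, 0.032, 0.035, 0.038])       # hypothetical zero rates


def value(z):
    """Present value of the cash flows under continuously compounded zero rates."""
    return float(np.sum(cash * np.exp(-z * T)))


def zero_to_forward_matrix(t):
    """Matrix A with f = A z, where f_i is the forward rate over (t[i-1], t[i])."""
    n = len(t)
    a = np.zeros((n, n))
    a[0, 0] = 1.0                                  # first forward equals first zero rate
    for i in range(1, n):
        dt = t[i] - t[i - 1]
        a[i, i] = t[i] / dt
        a[i, i - 1] = -t[i - 1] / dt
    return a


A = zero_to_forward_matrix(T)   # df/dz
J = np.linalg.inv(A)            # dz/df, the Jacobian used to transform risk

# Bucketed risk to zero rates (analytic here), then transformed by the chain rule:
# dV/df = (dz/df)^T dV/dz.
zero_risk = -cash * T * np.exp(-z0 * T)
forward_risk = J.T @ zero_risk


def bumped_forward_risk(h=1e-6):
    """Central finite differences in the forward rates, used as a cross-check."""
    f0 = A @ z0
    out = np.empty_like(f0)
    for k in range(len(f0)):
        df = np.zeros_like(f0)
        df[k] = h
        out[k] = (value(J @ (f0 + df)) - value(J @ (f0 - df))) / (2.0 * h)
    return out


print(forward_risk)
print(bumped_forward_risk())
```

A useful sanity check is that the first-order P&L of any small rate move is the same in both bases: `zero_risk @ dz` equals `forward_risk @ (A @ dz)`.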

Machine learning on multimodal and unstructured data for healthcare applications

Project Title	Machine learning on multimodal and unstructured data for healthcare applications
Keywords	AI, machine learning, healthcare, multimodal data
Project Listed	4 February 2025
Project Status	Open
Contact Name	Sam Genway
Contact Email	sam.genway@lifearc.org
Company/Lab/Department	LifeArc
Address	Accelerator Building, Open Innovation Campus, Stevenage, SG1 2FX
Project Duration	9 weeks, full-time - starting 30 June 2025
Project Open to	Undergraduates, Master's (Part III) students
Background Information	At LifeArc, our ambition is to make life science life changing. We do this by advancing scientific discoveries beyond the lab, faster, so that they can shape the next generation of diagnostics, treatments, and cures. Working at the cutting edge of translational science and as the early-stage translation specialists, we progress scientific discoveries on their journey to becoming a medicine, diagnostic or intervention that can improve patients’ lives. Our work begins by seeking out innovative science, then helping to develop this to a point where there is a clinical and commercial pathway for others to invest the time and money to take it further forward. Data Sciences is an integral part of LifeArc’s Science organisation. We work with our laboratory-based projects to analyse, visualise and interpret data in order to design future experiments; we build computational models to make predictions, often using the latest AI and machine learning methods; we develop computational workflows and write software; we work closely with LifeArc colleagues and with external collaborators in multiple project teams. Our methods are applied to tackle problems in chemistry, biology and medical/clinical science.
What we can offer you: Because we understand everyone has different requirements, our flexible benefits allow you to choose those which are important to you. These include a pension scheme with employer contributions of up to 12%, private health insurance, and annual leave of 31 days plus bank holidays (prorated for the duration of the placement). Join us, and you’ll have the scope to be creative and take measured risks. You’ll be rewarded for your curiosity, for working as one team, and for learning fast. And you’ll have everything you need to be your best every day. We all have potential. At LifeArc, you’ll discover what you can really do with it.
Project Description	Job Title: Summer Placement Student (Data Sciences)
Location: Stevenage
Job Type: Temporary (9 weeks), full-time
Placement Start Date: 30 June 2025
Salary: £22,575 per annum (£21,500 base, plus £1,075 allowance, which can be taken as cash or used for additional benefits), prorated for the duration of the summer placement
At LifeArc, we want to hear from people who are as passionate as we are about making life science life changing and improving patients’ lives.
A bit about the role: This is an opportunity to get involved in exciting work within the Data Sciences team at our state-of-the-art facilities in Stevenage. LifeArc is a self-funded not-for-profit organisation with a mission to impact patient unmet needs. Artificial Intelligence (AI) brings new paths to patient impact through the development and translation of healthcare AI applications. However, several challenges exist in the application of machine learning methods to real-world problems, such as predicting patients at risk of disease, or informing a diagnosis or prognosis. One particular challenge is leveraging multimodal data available in patient cohorts during training to create impactful models which provide utility when some modalities are not available at inference time. Another challenge is the formulation of machine learning methods for time-to-event predictions leveraging unstructured datasets. In each case, there are multiple approaches which could be valuable, and the aim of this project is to compare them and identify those with real-world utility.
Project: The project will focus on one or two of the following challenges:
1. Multimodal machine learning with modalities available exclusively during training. This has broad relevance across applications where data modalities are available in clinical study data with the potential to create better representations and models even when these modalities are not available at inference time (for example, in a healthcare setting). Examples include imaging data alongside patient histories, biomarkers and demographic data. A wide variety of methods exist to tackle this challenge including imputation, joint representation learning, and model distillation. The aim of this project is to explore which approach(es) would be of utility in realistic real-world scenarios.
2. Machine learning time-to-event predictions from unstructured data. This is relevant for identifying patients at risk of disease or clinical events of interest using unstructured data such as images or text. A number of approaches have been developed, ranging from formulating appropriate loss functions to neural ODEs. The goal here is to explore each within the context of real-world applications and identify which are of utility.
About you: Education and experience required: You will currently be in your 2nd, 3rd or 4th year, studying for your first degree, and available to commence a summer placement from 16 June 2025. You will be studying for a Mathematics degree.
Skills and strengths we are looking for during the recruitment and selection process: on track to achieve at least a 2:1 classification; collaboration; drive and determination; analytical mindset; accountability; adaptability; desire to learn; effective communication skills.
You are not expected to have a deep background in life sciences or healthcare. We want to hear from students who are passionate about the application of machine learning methods for real-world impact in healthcare, with experience in predictive modelling and hands-on programming experience in Python. Candidates must have the right to work in the UK.
Application Process: Applications are open from 5 February 2025. As part of the application process, please send the following via email to Sam Genway (Scientific Director, Machine Learning and AI) at Sam.Genway@lifearc.org:
1. CV
2. Cover letter. As well as telling us why you are interested in the project and why you think you would be a good fit, please include a short response to the following questions (max 150 words per question): What interests you about a role within our industry (Translational Science)? What part of LifeArc, or the work we are involved in, do you personally believe in the most, and why?
Application Closing: 28 February 2025. If your application is successful, you will be invited to a final stage virtual interview. Full instructions and guidance on how to approach and prepare for the assessment will be provided.
We are also proud to be using Rare Recruitment's Contextual Recruitment System (CRS), which allows us to consider your achievements in the context in which they were gained. We understand that not all candidates’ achievements look the same on paper, and we want to recruit the best people, from every background. We would therefore encourage you to submit your contextualised data using the Rare Contextual Recruitment System as part of your application using this link: https://lifearc.app.contextualrecruitment.com/apply/cf4cc979-16f6-435b-922d-ca52259fb839
Work Environment	The student will work at our Stevenage site on their own project, but with regular supervision and with other members of the data sciences group available to talk to about the project.
References
Prerequisite Skills	Statistics, Predictive Modelling
Other Skills Used in the Project	Image processing
Acceptable Programming Languages	Python
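
The description mentions formulating appropriate loss functions for time-to-event prediction. One widely used example is the negative Cox partial log-likelihood, sketched below for risk scores that a model might produce from images or text. This is an illustration rather than the approach LifeArc has chosen; it assumes no tied event times, and the toy data are made up.

```python
import numpy as np


def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk  : model output per patient (higher means earlier expected event)
    time  : follow-up time per patient
    event : 1 if the event was observed, 0 if the patient was censored
    Assumes no tied times, so no tie correction is applied.
    """
    order = np.argsort(-time)                    # latest follow-up first
    risk, event = risk[order], event[order]
    # Running log-sum-exp: log of the summed exp(risk) over everyone still at risk.
    log_risk_set = np.logaddexp.accumulate(risk)
    n_events = max(int(event.sum()), 1)
    return -np.sum((risk - log_risk_set)[event == 1]) / n_events


time = np.array([1.0, 2.0, 3.0, 4.0])
event = np.array([1, 1, 1, 1])

well_ordered = np.array([3.0, 2.0, 1.0, 0.0])    # highest risk for the earliest event
badly_ordered = well_ordered[::-1].copy()

print(cox_neg_log_partial_likelihood(well_ordered, time, event))
print(cox_neg_log_partial_likelihood(badly_ordered, time, event))
```

Because the loss depends on the data only through the ordering of times and the risk scores, it can be attached to any model that outputs a scalar per patient.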

Veterinary Data Analysis

Project Title	Veterinary Data Analysis
Keywords	1. Quantitative & Qualitative Analysis – The project involves both numerical and non-numerical data, requiring mathematical expertise to extract meaningful insights. 2. Statistical Skills – A strong background in statistics is essential to analyse financial and survey data effectively. 3. Incomplete Data Handling – The dataset has gaps, requiring mathematical techniques to manage missing information and ensure accurate interpretations. 4. Impact Assessment – The analysis will measure the effect of specialist advice on confidence, well-being, and commercial outcomes, aligning with mathematical modeling and evaluation skills. 5. Structured Reporting – Findings must be organised in a clear, data-driven written format, a key strength of mathematicians skilled in data communication.
Project Listed	28 March 2025
Project Status	Open
Contact Name	Richard Artingstall
Contact Email	richard.artingstall@vet-ct.com
Company/Lab/Department	VET.CT
Address	Hauser Forum, 21 JJ Thomson Ave, Cambridge CB3 0FA
Project Duration	8-10 weeks full time
Project Open to	Undergraduates, Master's (Part III) students
Background Information	VET.CT is a global leader in teleradiology and case advice services for veterinary practices. We are seeking a master's-level student in Mathematics, Statistics, or a related field for a paid opportunity to provide detailed quantitative and qualitative data analysis and a written report summarising findings and insights. We recently conducted a 6-month study across multiple veterinary practices, comparing outcomes with and without one of our key services. This generated a wealth of data, including quantitative data on the potential financial benefits to vet practices and qualitative data on the potential emotional benefits for vets, such as increased confidence.
Project Description	We want to analyze our dataset to gain insights into whether providing specialist advice to different groups impacts confidence, well-being, and commercial performance for the businesses that use our services. Since the data is incomplete in some areas, we need someone who can think laterally and work creatively to interpret the information. A thorough analysis could have a significant impact, improving the support and well-being of veterinary professionals while also enhancing the care and outcomes for the animals they treat.
Work Environment	The student will be working on their own but can use our offices and feel like part of the team. They will liaise with the Director of Teleconsulting and the Head of Marketing.
References
Prerequisite Skills	Strong analytical and statistical skills; experience with survey and financial data analysis; ability to organise findings in a concise and accessible written format
Other Skills Used in the Project	Statistics
Acceptable Programming Languages	No Preference