Founding Computational Protein Scientist (gn) @ Biotech Venture, Cambridge (UK)

DropCode is building the data engine for protein function. Starting with enzymes, we use our patented droplet microfluidics platform to capture exponentially more data on protein function than conventional methods, linking genotype to phenotype at per-droplet resolution, making every droplet a micro test tube. This data fuels machine learning models that learn in ever greater detail how sequence determines function. Our wedge is enzyme engineering for biocatalysis and industrial biotechnology, but our ambition is to make DropCode the definitive platform for protein function prediction.

We are Cambridge PhDs with deep expertise across microfluidics, biochemistry, machine learning, optics, and engineering. We believe the language of biology is machine learning, and that the fastest path to transformative models is not just better AI but better inputs.

We are looking for an exceptional founding computational scientist to lead our machine learning and protein modelling efforts. You will own the sequence–function modelling stack end to end: from processing large-scale functional datasets generated in our microfluidic runs, to training and deploying generative and predictive models that drive the next round of experiments. You will work in a tight loop with the biology and engineering teams, turning quantitative phenotypic data into closed-loop active learning systems that continuously improve our models.

This is a foundational role. You will be building the ML infrastructure from the ground up, and your architectural choices will shape DropCode for years.

Responsibilities

  • Design and train sequence–function models on deep mutational scanning datasets and high-throughput screening outputs from our microfluidics platform

  • Develop and iterate generative models (transformers, diffusion models, or equivalent) for enzyme sequence design and optimisation

  • Build closed-loop active learning pipelines that couple ML predictions with experimental design, shortening the design–build–test–learn cycle

  • Model protein fitness landscapes, including epistatic interactions, to navigate high-dimensional sequence space intelligently

  • Partner with the biology team to define the data collection strategy and ensure experimental outputs are ML-ready

  • Establish best practices for model evaluation, benchmarking, and uncertainty quantification in the context of functional prediction

  • Own and grow the computational stack as the team scales

Requirements

  • Undergraduate grounding in a hard science (mathematics, physics, or computer science), with that rigour subsequently applied to biological problems

  • PhD in machine learning, deep learning, or a closely related computational discipline

  • A track record of designing and building custom model architectures from scratch, not just fine-tuning or deploying off-the-shelf systems; ideally applied to biology, but strong work in any demanding applied domain is relevant

  • Demonstrated contribution to a meaningful breakthrough in protein design or sequence–function modelling

  • Proven hands-on experience with protein language models or generative models applied to biological sequences

  • Deep familiarity with deep mutational scanning, large-scale functional datasets, or comparable high-throughput data modalities

  • Strong understanding of fitness landscape theory and epistasis in the context of sequence optimisation

  • Experience building active learning or Bayesian optimisation systems that integrate ML with experimental feedback

  • Excitement at the prospect of working with large volumes of proprietary, quantitative functional data unavailable anywhere else

  • Comfort operating in the ambiguity of early-stage R&D, and motivation to build foundational infrastructure

You are frustrated by the slow, artisanal nature of current biological engineering and believe the field needs a step-change in data scale and quality. You think quantitatively, treat every experiment as a data point for a model, and have strong opinions about what it takes to build the best protein design systems in the world. You thrive in collaborative, fast-moving environments where the pace is set by scientific urgency, not process.

Location & Eligibility

Where is the job: Cambridge, United Kingdom (on-site at the office)
Who can apply: GB

Listing Details

First seen: May 6, 2026
Last seen: May 9, 2026
