London Prism Digital Ltd 1-2 Paris Garden London SE1 8ND
Platform Engineer 2022-07-14 Penpole 2022-08-14

Platform Engineer

£65,000

London
Eman Abobakr


Platform Engineer | Linux, Python | Digital Biology

 

Our client, who specialises in Digital Biology, is building a pharma company from the ground up with AI at its core. Think of it as the 'new' GSK/AstraZeneca/Pfizer, but built on AI/ML tech! They are looking for a Platform Engineer to join their growing organisation and take ownership of their Linux-based core infrastructure, along with some public cloud. The company has been set up by genuine VIPs from the pharmaceutical and technology worlds. This is a great opportunity to join as the company's first Platform Engineer.

 

You will be responsible for the software and hardware infrastructure that supports our client's ML operations. You should be a hands-on Linux person with Python skills. The role will introduce you to machine learning and the software environment surrounding it, and you will work closely with the Nvidia cluster and the partnership with the Cambridge-1 supercomputer. There is MASSIVE computing power here, and this is a unique opportunity to get access to some of the highest computing capacity available.

 

The business currently has over 15 people and is growing to 40 by the end of the year, with £25 million in seed funding! The role is based in their London office, but they are also remote-friendly. It is an incredibly progressive and supportive team, and they will certainly provide the conditions for you to thrive as a Platform Engineer.

 

Essential Skills:

  • Linux System Administration - the OS is Ubuntu, but experience in any flavour is good.
  • Python experience is preferred, but if you are strong with another development language this will be considered.

 

Nice to Have Skills:

  • Public Cloud (our client uses AWS & Oracle Cloud)
  • Kubernetes

 

Role specifics:

  1. Manage software to execute ML applications at scale, from tens of GPUs to hundreds.
  2. Manage the internal cluster running Ubuntu, Cumulus Linux, SLURM and Base Command.
  3. Support scaling of code from a single GPU to multiple GPUs for PyTorch, PyTorch Lightning and PyTorch Geometric.
  4. Develop systems to support experiment tracking.

 

Technical environment:

  • Need to be very hands-on with Linux (Ubuntu is the OS).
  • Work closely with the Nvidia cluster and the Cambridge-1 partnership: hundreds of the latest GPUs!
  • Public cloud: Oracle Cloud, transitioning to a hybrid model with an internal cluster.
  • Some data science sitting on AWS.
  • A few services, e.g. databases, run in Kubernetes on Oracle Cloud.
  • $650,000 spent on the on-prem environment: 2 DGX hardware servers (Ubuntu machines) at $300k per machine.
  • HGX at $150k: the client has one and will be buying more. Massive servers and switches to support the systems.
  • Bare metal: super fast, super low latency.
  • Looking for someone to take ownership of the systems and databases, and to support the development of the data infrastructure.

 

Projects:

  • Help build the cluster, move it to a data centre and operationalise it (3 months).
  • Support ML training at scale (tens of GPUs internally and hundreds externally).
  • Build the data infrastructure, including a knowledge graph to support ML training at terabyte scale.
  • Operationalise MLOps in a lab-in-the-loop setting: build the pipeline that defines the next round of experiments (Airflow in a loop).
  • Potential future project: lab automation (robots trained offline in a metaverse setting and run in the lab).

 

Plenty to do and a brand-new green field environment to get your mitts on here!

 

Benefits:

  • Bonus: 10%
  • Stock options
  • Friday social drinks
  • 2 days a week in the office

 

This is a great chance to work with a growing, forward-thinking company. Apply today for the best chance of success!

 

Job reference: #BBBH10166