Foresite Capital | Data Infrastructure Engineer | Onsite | SF
https://hire.withgoogle.com/public/jobs/foresitecapitalcom/v...
Role
Foresite data science is a translational R&D team that derives insights from precision measurement and population-scale biology to address unmet clinical needs. Through a combination of external investment and company incubation, our goal is to create the preeminent portfolio of companies at the interface of data science and healthcare.
Within data science, Foresite’s Platform Team develops the methods and infrastructure to solve key scientific and clinical problems. We combine deep biological knowledge with rigorous statistical genetics and modern engineering practices to develop and critically evaluate therapeutic and interventional hypotheses. We are particularly focused on the combination of novel statistical and machine learning methods to produce reliable insights about causal factors in disease at an unprecedented scale. This work supports critical investment decisions and supplies a core around which new ideas are de-risked and incubated.
We are looking for software engineers with experience building large-scale data systems, familiarity with biological data, and an eagerness to learn. We offer a flexible work environment, a diverse set of projects, and a best-in-class peer group to learn from. This is a great opportunity to tackle a unique set of problems while shaping the future of healthcare.
Responsibilities
- Develop, maintain, and optimize in-house ETL pipelines for diverse biological datasets
- Build and refine versatile, usable systems for genetic analysis at scale
- Work with data scientists and biologists to match engineering requirements to user needs
- Design and implement continuous integration and automated deployment for pipelines and distributed analysis environments
- Distribute tools and results through open-source contributions, as relevant
- Work in close peer-to-peer collaboration with multidisciplinary colleagues throughout Foresite
Qualifications
- 2-6+ years of relevant development and data engineering experience with rigorous engineering practices, including testing, code review, and requirements-based design.
- Experience with cloud platforms, such as AWS or GCP.
- Deep experience building, optimizing, and maintaining large-scale data pipelines using the Hadoop/Spark ecosystem or similar.
- Strong understanding of SQL, including experience working with one or more RDBMS (e.g., PostgreSQL or MySQL).
- Deep development experience with one or more general-purpose languages (e.g., C++ or Scala).
- Strong skills with one or more scripting languages (e.g., Python).
- Exposure to statistical languages (e.g., R) is useful but not essential.
- BA/BS in computer science or equivalent (minimum). Master's, PhD, or other further education or experience is valued but not required.
Foresite Capital is an equal opportunity employer. We thrive on diversity and collaboration.