Data Engineer

San Francisco Metro Area, CA

Posted: 12/03/2018 Job Type: Engineering Job Number: JN -122018-23171
You will work with computational and research scientists to define strategies and implement systems for modeling, collecting, storing, and accessing diverse scientific data and metadata. Collaborating with other scientists and engineers, you will design, build, and maintain databases and data warehouses that underpin our scientific endeavors and accelerate our ability to ask new, sophisticated questions spanning multiple organisms, data modalities, and timescales. You will not only build tools to support existing scientific workflows, but also help set the vision for future data generation and collection efforts.
If you are passionate about data, passionate about biology, and passionate about their intersection, this is the job for you.

Responsibilities:
  • Work with computational and research scientists to understand common analysis use cases and data access needs.
  • Design strategies for data storage and integration across different data sources (both internal and external) for multiple use cases.
  • Implement, document, and maintain processing pipelines, databases, and data warehouse infrastructure.
  • Work closely with full-stack engineers to develop APIs and GUIs for accessing and visualizing scientific data.
  • Set data engineering vision and drive both independent and collaborative software development projects end-to-end.
  • Contribute to a range of projects, from one-off solutions to long-term, complex systems.
  • Build out core infrastructure, tooling, and software development processes.
Requirements:
  • 5+ years working with contemporary ETL tools and frameworks.
  • 3+ years building Python-based back-end systems.
  • Fluent knowledge of SQL.
  • Experience implementing RESTful APIs, GraphQL, and other programmatic interfaces to complex multidimensional data.
  • Experience deploying high-performance data back-ends in the cloud with Amazon Web Services, Heroku, Google Cloud Platform, or a similar service.
  • Firm grasp of software testing and test-driven development.
  • Demonstrated success in owning projects end-to-end, including working with non-technical stakeholders to define requirements and seek feedback.
Nice to have:
  • Worked with machine learning tools and infrastructure, e.g. TensorFlow and PyTorch.
  • Built back-ends for high-dimensional graph or network data.
  • Worked in biology or the life sciences and gained familiarity with the databases and data types used by computational biologists.
  • Built software with technologies like ElasticSearch, GraphQL, and Google Cloud Platform.