Data Pipeline Engineer

This job has now expired. Please search on the home page to find live IT jobs.

Overview

Our team delivers cutting-edge machine learning and data science solutions to challenging problems across the art, crypto-art, law, and social media sectors. We collect huge amounts of data by reviewing websites, scanning books, and recording videos.

We are looking to grow our engineering team to manage the large amounts of data produced on the web, social media, and blockchains. You would be joining a strong team of 30+ people (data scientists, engineers, and analysts) in a fast-paced, collaborative environment.

As a data engineer, you will be responsible for:

Writing data pipelines in Python or R to extract data from blockchain and service APIs and load it into relational databases.

Developing tools to monitor existing data pipelines and web scrapers.

Developing tools to monitor data quality in various databases.

Improving the data quality of the databases by working on hotfixes with the help of QA & fellow data engineers/data scientists.

Deploying machine learning models on a secured public-facing API via Kubernetes/Kubeflow/Ray Serve.

Writing and securing public-facing APIs on Kubernetes to allow external partners to access our services.

Working with data scientists to deploy the following on Kubernetes/AWS cloud:

building datasets and training models

monitoring the prediction performance of models already deployed and serving

online machine learning pipelines

Monitoring load on databases and suggesting ways to optimise SQL queries.