Machine Learning Expert
This is a full-time, fixed term (18 months) position on Crick terms and conditions of employment.
Summary
We are planning to conduct the largest human proteome study to date, measuring plasma samples from up to 12,000 individuals. The data generated will help predict human metabolic diseases such as Type II diabetes and fatty liver disease. You will provide expert guidance on how to process, normalize and analyse data from large-scale proteomic studies, and collaborate with internal and external scientists to generate and test hypotheses. To be considered for this role you will have hands-on experience of analysing massive datasets and of applying and implementing state-of-the-art machine learning algorithms.
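To illustrate the kind of processing and normalization mentioned above, here is a minimal sketch of one common approach to plasma proteomics intensity data: log-transform, then median-centre each sample. All names and values are illustrative, not taken from the study described in this posting.

```python
import numpy as np
import pandas as pd

# Hypothetical intensity matrix: 4 "proteins" measured across 3 plasma
# samples. Real studies would have thousands of proteins and samples.
rng = np.random.default_rng(0)
intensities = pd.DataFrame(
    rng.lognormal(mean=10, sigma=1, size=(4, 3)),
    index=[f"protein_{i}" for i in range(4)],
    columns=[f"sample_{j}" for j in range(3)],
)

# Log-transform to stabilise variance, then subtract each sample's median
# so samples are comparable despite run-to-run intensity drift.
log_int = np.log2(intensities)
normalized = log_int - log_int.median(axis=0)
```

After this step each sample column has a median of zero, which is one simple way to make intensities comparable across measurement runs; real pipelines often add batch correction and missing-value handling on top.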
Project summary
You will join a team of scientists conducting the largest proteomics experiment to date. Members of the UK population have been followed for over two decades: blood samples have been collected and their medical histories recorded. We have now developed a technology that makes the analysis of this large sample collection feasible, producing data suitable for training machine learning algorithms to predict disease-related phenotypes.
You will be part of the team leading data analysis, data management, and the development of algorithms for understanding how molecular data can be used to predict disease prognosis. The position is temporary, but with your dedicated input the project has strong potential to become a spin-off company translating state-of-the-art proteomics technology into personalized medicine applications, and you would become a part of it.
Key responsibilities
These include but are not limited to:
Develop expertise in the analysis of mass spectrometry-based 'omics' data sets, with a focus on proteomics data
Develop a system for data handling and management
Assess and implement new supervised/unsupervised learning algorithms
Maintain all code and programs used for analysis and tool development under source control
Maintain accurate and complete records of analysis projects and team activities
Present results and conclusions to internal and external stakeholders
Communicate results through graphical representations/visualizations, reports, algorithms, models, and dashboards
Use machine learning methods (including deep neural networks) to model and predict research outcomes
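As a concrete illustration of the last responsibility above, the following is a minimal sketch of a supervised model predicting a binary phenotype from protein-like features. It uses scikit-learn (named in this posting); the data is synthetic and the feature/label structure is an assumption for demonstration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a plasma proteomics matrix: 200 "individuals",
# 50 "protein" features; the label loosely depends on the first two
# features, mimicking a phenotype driven by a small protein signature.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 50))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hold out a test set to estimate predictive performance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

In practice a deep neural network (as the posting mentions) could replace the random forest here; the train/test split and held-out evaluation would stay the same.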
Key experience and competencies
The post holder should embody and demonstrate our core Crick values: bold, imaginative, open, dynamic and collegial, in addition to the following:
Essential
Qualifications, experience and competencies:
An advanced degree (PhD) in Machine Learning, Data Science, AI, Computer Science, Computer Engineering, Electrical Engineering, Physics, Statistics, Applied Mathematics or another quantitative field, or extensive proven academic/industrial experience in a quantitative field
Experience working in data science and/or predictive modeling
Demonstrated ability to lead and execute projects from start to finish
Ability to independently support and integrate into existing projects
Proven track record in modifying and applying advanced machine learning algorithms to address practical problems
Proficient in deep learning (CNNs, RNNs, LSTMs, attention models, etc.), machine learning (SVMs, GLMs, boosting, random forests), graph models, and/or reinforcement learning
Experience with open-source tools for deep learning and machine learning such as Keras, TensorFlow, PyTorch, scikit-learn, pandas, etc.
Proven ability to work independently on development of complex models with extremely large and complex data structures
Proficient in more than one of Python, R, C++, or C
Experience in large-scale data analysis using Spark
Robust knowledge and experience with statistical methods
Desirable
Qualifications, experience and competencies:
Previous experience working with mass spectrometry
Experience working with unsupervised methods
Experience with Hadoop and NoSQL-related technologies such as MapReduce, Spark, Hive, HBase, MongoDB, Cassandra, etc.
Experience with GPU programming
Experience with Agile methods for software development