Exascale Matrix Factorization: Using Supercomputers and Machine Learning for Drug Discovery
Tuesday, 21st May @ 14:00 BST / 15:00 CEST
In this webinar we will present the ExCAPE project, the Bayesian Matrix Factorization techniques it used, and how the POP CoE gave us crucial insights into the scaling bottlenecks of our code, helping us remove them. We will also show how the HPC infrastructure and implementations were crucial in producing insights that helped the pharma industry in its drug discovery process.
In the European-funded ExCAPE project, we investigated the power of supercomputers to speed up drug discovery using machine learning. One of the machine learning algorithms we studied was “Matrix Factorization” (MF). MF is a core machine learning technique for collaborative filtering applications such as recommender systems. In drug discovery, it can be used to predict the interaction between chemical compounds and protein targets.
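To make the idea concrete, here is a minimal sketch of plain (non-Bayesian) matrix factorization on a tiny compound-by-target activity matrix with missing entries. It is not the ExCAPE code: the data is synthetic, the alternating least squares solver is a stand-in chosen for brevity, and the names `factorize`, `rank`, and `reg` are hypothetical.

```python
import numpy as np

def factorize(R, mask, rank=2, reg=0.01, iters=200, seed=0):
    """Fit R ~ U @ V.T on observed entries (mask == True) with
    alternating least squares and L2 regularization."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.normal(scale=0.1, size=(n, rank))  # compound factors
    V = rng.normal(scale=0.1, size=(m, rank))  # target factors
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Update each compound's factors from the targets it was measured on.
        for i in range(n):
            Vi = V[mask[i]]
            U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ R[i, mask[i]])
        # Update each target's factors from the compounds measured on it.
        for j in range(m):
            Uj = U[mask[:, j]]
            V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ R[mask[:, j], j])
    return U, V

# Synthetic toy data (an assumption for illustration): 5 compounds x 4 targets,
# generated from two hidden factors so a rank-2 model can fit it.
A = np.array([[1, 0], [0, 1], [1, 1], [2, 1], [1, 2]], dtype=float)
B = np.array([[1, 2], [2, 1], [1, 1], [3, 0]], dtype=float)
R_true = A @ B.T

# Pretend three compound-target pairs were never measured.
mask = np.ones_like(R_true, dtype=bool)
for i, j in [(0, 3), (2, 1), (4, 0)]:
    mask[i, j] = False

U, V = factorize(R_true, mask)
pred = U @ V.T

# Error on the entries the model was trained on.
rmse_observed = float(np.sqrt(np.mean((pred[mask] - R_true[mask]) ** 2)))
print("RMSE on observed entries:", rmse_observed)
print("Predictions for unmeasured pairs:", pred[~mask])
```

The rows of `U` and `V` act as learned low-dimensional descriptions of compounds and targets, and their inner products fill in the unmeasured pairs. This sketch returns only point predictions, with no measure of how certain they are.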
The Matrix Factorization variant studied in ExCAPE is Bayesian Matrix Factorization (BMF). BMF has the advantage of providing confidence estimates for its predictions, but it is more computationally intensive than standard MF. A high-performance parallel implementation of BMF suitable for HPC was therefore needed. With help from the POP CoE, this implementation was developed and optimized. It allowed us to gain new insights into compound-protein interaction, thanks to large-scale models built on datasets that were previously intractable.
• ExCAPE: http://excape-h2020.eu
• BPMF: https://github.com/ExaScience/bpmf
About the Presenter
Tom Vander Aa is a researcher and project coordinator in the ExaScience Life Lab at imec. This lab creates new supercomputer solutions to generate breakthroughs in life sciences and biotechnology. His interests are in software engineering for high-performance computing and machine learning. Before joining the ExaScience Lab, he was at Target Compiler Technologies and in the architecture and compiler group at imec, working on low-energy, high-performance architectures and compilation techniques.