Title: Protein regression models as a cornerstone of AI-guided protein evolution
Abstract:
Protein engineering plays a central role in developing biocatalysts for biotechnology, biomedicine, and the life sciences. In recent years the field has changed significantly with the integration of machine learning (ML) techniques. Our study focuses on the application of ML algorithms in enhancing biocatalyst properties, including enzyme stability, function, and solubility. We have pioneered the use of ML algorithms as effective tools in protein engineering, specifically targeting biocatalysts. Our methodology applies ML models in two steps. First, our models predict protein sequence-to-function mappings. The approach does not require detailed mechanistic or structural data but can integrate them when available. It has proven particularly effective in low-data regimes, even when only a few dozen functionally assayed sequence variants are available. Second, we use these predictions in a Bayesian optimization framework to guide the selection of candidates for experimental validation. This process allows simultaneous optimization of multiple parameters, such as stability, catalytic speed, and substrate specificity. A notable achievement of our research is the performance of our prediction algorithms: they consistently outperform current state-of-the-art methods, including a recent algorithm developed by Novartis, across various datasets and benchmarks. The practical applicability of our algorithms was further validated through successful protein engineering campaigns that enhanced the functionality of complex enzymes such as carboxylases, hydrogenases, and phosphohydrolases. Our findings underscore the potential of ML methods to expedite directed evolution and rational design of proteins. By drawing on existing sequence variant data, these methods predict and select sequences with enhanced properties.
During the talk, we will discuss these advancements in detail, highlighting practical applications, limitations, and the impact of these methods on the future of protein engineering.
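The two-step workflow described in the abstract (a sequence-to-function regression model followed by Bayesian optimization for candidate selection) can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot encoding, the RBF-kernel Gaussian process, the upper-confidence-bound acquisition rule, all function names, and the toy sequences and activity values are assumptions chosen only to show the general idea with a single objective.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Encode a sequence as a flat one-hot vector of length len(seq) * 20."""
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq):
        x[pos, AMINO_ACIDS.index(aa)] = 1.0
    return x.ravel()


def rbf_kernel(a, b, length_scale=2.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Step 1: Gaussian-process regression (posterior mean and std)."""
    y_mean = y_train.mean()  # centre the targets, add the mean back later
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = k_s.T @ alpha + y_mean
    v = np.linalg.solve(chol, k_s)
    var = 1.0 - (v**2).sum(0)  # prior variance is 1 for this kernel
    return mean, np.sqrt(np.maximum(var, 0.0))


def select_candidates(train_seqs, train_y, candidates, n_select=2, beta=2.0):
    """Step 2: rank untested variants by upper confidence bound (UCB)."""
    x_train = np.stack([one_hot(s) for s in train_seqs])
    x_cand = np.stack([one_hot(s) for s in candidates])
    mean, std = gp_posterior(x_train, np.asarray(train_y, dtype=float), x_cand)
    ucb = mean + beta * std  # trades off predicted value and uncertainty
    order = np.argsort(-ucb)[:n_select]
    return [(candidates[i], float(mean[i]), float(std[i])) for i in order]


if __name__ == "__main__":
    # Hypothetical toy data: four assayed variants with made-up activities.
    assayed = ["ACDE", "ACDF", "GCDE", "ACKE"]
    activity = [1.0, 1.4, 0.6, 1.9]
    untested = ["ACKF", "GCKE", "GCDF", "WCDE"]
    for seq, mu, sigma in select_candidates(assayed, activity, untested):
        print(f"{seq}: predicted {mu:.2f} +/- {sigma:.2f}")
```

In this sketch, `beta` controls how strongly uncertain candidates are favoured over candidates with a high predicted value. A multi-parameter campaign of the kind mentioned in the abstract would need one model per property and a multi-objective acquisition rule, neither of which is shown here.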
Audience Take Away
- The growing importance of machine learning in protein engineering: This presentation will deepen the audience's understanding of how machine learning is revolutionizing protein engineering, highlighting a specific methodology that is transforming current practices in the field.
- Enhancing protein engineering with regression models: Gain insights into how advanced protein regression models can significantly elevate the outcomes of protein engineering projects.
- Efficient variant extrapolation: Learn how recent advancements in these regression models enable the use of only a few dozen functionally assayed protein variants to extrapolate to superior sequences efficiently.
- Orthogonal approach to conventional techniques: Understand how these methods complement traditional techniques and overcome their limitations, such as the trial-and-error nature of directed evolution and the dependence of structure-guided rational design on precise mechanistic knowledge.