Developing Modality Support for PROTACs | GSoC 2024

Hi DeepChem Community!

I’m David, and I am one of the GSoC contributors this year! My project will focus on improving modality support for PROTACs, an emerging class of therapeutics with the potential to change the way we think about drug discovery. I will build out infrastructure to support analysis of PROTAC molecules, along with tutorials that provide a primer on the fascinating world of PROTACs and targeted protein degradation! If you would like to learn more, please check out the official project description at https://summerofcode.withgoogle.com/programs/2024/projects/HadcZO5q. In the meantime, follow along in this forum thread as I post updates on my progress and engage with the community!

Progress Report 1: PROTAC Starter Tutorial

Hi everyone, to start off my project, my mentor (David Figueroa) and I thought it would be good to give everyone curious about PROTACs an introductory tutorial notebook that goes through the basic science and explores how we can infer degradation rates of PROTACs, a crucial property for determining efficacy.

PR 1: PROTAC tutorial:

  • A brief introduction to the basic science underlying PROTAC degradation
  • Exploring the PROTAC-DB dataset
  • Featurization of PROTAC SMILES representations
  • A simple MLP model to regress DC50 values of PROTACs
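
As a small aside on the regression target: DC50 values usually span several orders of magnitude, so a common choice is to regress on a log scale. Here is a minimal sketch of that conversion, assuming DC50 is reported in nanomolar. The helper name `dc50_nm_to_pdc50` and the example values are hypothetical and only for illustration; they are not taken from the tutorial or from PROTAC-DB.

```python
import math


def dc50_nm_to_pdc50(dc50_nm: float) -> float:
    """Convert a DC50 given in nanomolar to pDC50 = -log10(DC50 in molar).

    Hypothetical helper for illustration; the tutorial may preprocess
    its labels differently.
    """
    if dc50_nm <= 0:
        raise ValueError("DC50 must be a positive concentration")
    # 1 nM = 1e-9 M, so -log10(dc50_nm * 1e-9) = 9 - log10(dc50_nm)
    return 9.0 - math.log10(dc50_nm)


# Made-up DC50 values in nM, chosen only to show the scale change.
example_dc50_nm = [1.0, 10.0, 1000.0]
example_pdc50 = [dc50_nm_to_pdc50(v) for v in example_dc50_nm]
print(example_pdc50)
```

On this scale a more potent degrader (lower DC50) gets a larger value, and a tenfold change in DC50 shifts the target by exactly one unit.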

Link to PR

If you are interested, please take a look and leave any feedback you may have! In the meantime, stay tuned as I explore other featurization techniques as we build towards a second tutorial!

Progress Report 2:

Hi everyone, this week I made a few changes to the PROTAC starter tutorial above. I added explanations of the science behind PROTACs to improve the readability of the tutorial, along with a few changes to the overall flow. I also met with my mentor David Figueroa to discuss how we would proceed with implementing a protein sequence featurizer. There is a rich literature on sequence featurization, and I would love to explore it in the context of PROTAC design and inference.

Feel free to reach out if you have any feedback or questions. Until next week!

Hey everyone!

The first tutorial diving into the world of PROTACs is now live. You can check it out here! Most of this week was spent finalizing this tutorial with my mentor David as well as Bharath. A huge shout-out to both of them for their feedback and comments.

Additionally, as an addendum to last week’s post, we have decided to shift our focus from implementing a sequence featurizer to exploring the available tools for linker design in PROTACs! We believe this will be a great resource for the scientific community, so stay tuned for updates on a tutorial covering that!

Hey everyone,

As per my previous update, this week was focused on a deep dive into the world of PROTAC linker design. There are a variety of tools out there, from graph-based to reinforcement learning (RL) approaches and from SMILES to 3D conformation generation. We are currently working on a tutorial that explores linker design using one of these tools, and are also fine-tuning one of the pretrained linker-design methods on PROTAC-DB. Stay tuned for more updates on the tutorial soon!

Hey everyone,

Updates for this past week:

  1. The first part of the tutorial has been finished. I go over how we can use the DeLinker tool to do linker design on PROTACs from PROTAC-DB.
  2. I will be meeting with a postdoc from the Craig Crews lab at Yale to discuss the computational limitations of PROTAC design. I definitely hope to get a lot out of this meeting.

Hi everyone,

Here are the updates for this week. I added background literature to the tutorial alongside some notes from my meeting earlier this week. The meeting was really productive, and here are some pointers I got from a medicinal chemist who is in the lab every day designing these cool molecules!

  • Preserving spatial orientation of the ligands is the most important part of designing a linker. You can have mediocre binding and still achieve degradation.
  • Linker design, of course, depends heavily on the target protein and its binding site (e.g. kinases vs. transcription factors)
  • Without a lysine residue on the target protein surface, it is extremely hard for ubiquitination to occur
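
To make the lysine point a bit more concrete, here is a minimal sketch of a first-pass check that simply lists where lysines (K) appear in a one-letter protein sequence. This is only a crude sequence-level proxy and my own assumption about how one might start: it says nothing about whether a lysine is actually exposed on the surface, which needs a 3D structure. The function name `lysine_positions` and the toy sequence are hypothetical.

```python
from typing import List


def lysine_positions(sequence: str) -> List[int]:
    """Return 1-based positions of lysine (K) residues in a protein sequence.

    Hypothetical helper for illustration. Sequence-level only: it does not
    tell us whether a lysine is solvent-exposed.
    """
    return [i for i, residue in enumerate(sequence.upper(), start=1)
            if residue == "K"]


# Made-up toy sequence, not a real target protein.
toy_sequence = "MKTAYIAKQRK"
positions = lysine_positions(toy_sequence)
print(f"{len(positions)} lysines at positions {positions}")
```

A target with no lysines at all would return an empty list, which by the point above would be a warning sign before spending time on linker design.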

Stay tuned for when I post the full tutorial with a more comprehensive summary of the meeting!