Torch compile and PyTorch 2.2.0 | GSoC 2024

Hey everyone,

I will be working on the project “Torch compile and PyTorch 2.2.0” as part of GSoC 2024. The project involves adding a feature and relevant tutorials to the DeepChem library for compiling PyTorch models for faster training and inference using torch.compile. More details of the project are available on the official GSoC project page: https://summerofcode.withgoogle.com/programs/2024/projects/Xwr0Z1tZ.

PyTorch 2.0 introduced torch.compile, a compiler that optimizes PyTorch models for faster training and inference by compiling the code into efficient kernels, offering advantages over existing methods like TorchScript or FX tracing. By introducing a way to compile DeepChem models, users will be able to take advantage of the efficient training and deployment of models offered by torch.compile. For more information about torch.compile, you can visit the official PyTorch docs or read the PyTorch 2.0 get-started page (https://pytorch.org/get-started/pytorch-2.0/), which gives an excellent explanation of the rationale, the internal workings, and why such a feature was required.

Over the span of GSoC, I will be using this forum thread to track my progress. I will post weekly updates about the project to this thread along with other relevant details such as the resources I referred to and the design decisions I make.

Feel free to reply to this thread for any further clarification regarding the project or the following updates.

You can connect with me via my
Twitter: gaushn_
Discord: ambient_pressure_perceptron

Weekly Update #1 (May 27 - June 2):

  • Made a Colab notebook for compiling and benchmarking the DMPNN model using a functional implementation of torch.compile for DeepChem models - Colab

  • Made a PR for adding torch.compile to DeepChem’s TorchModel class - PR
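For anyone who wants to run a similar benchmark, here is a rough sketch of the kind of timing helper that can be used. It is not the code from the notebook: `benchmark` and `toy_forward` are hypothetical names made up for illustration, and the toy function only stands in for a model's forward pass. The warm-up calls matter because the first call to a compiled model triggers compilation and would otherwise dominate the measurement.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, repeats=10):
    """Return the median wall-clock time (in seconds) of fn(*args).

    Hypothetical helper for illustration only.
    """
    # Warm-up calls are excluded from timing; for a compiled model the
    # first call includes the compilation itself.
    for _ in range(warmup):
        fn(*args)

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def toy_forward(n):
    """Stand-in for a model forward pass (assumption, not a real model)."""
    return sum(i * i for i in range(n))


median_s = benchmark(toy_forward, 10_000)
print(f"median latency: {median_s * 1e6:.1f} us")
```

When timing on a GPU, the device should be synchronized (for example with `torch.cuda.synchronize()`) before reading the clock, since CUDA kernels run asynchronously.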

The PR contains only the basic implementation of the compile function, supporting the inductor backend (the default) and the modes default and max-autotune-no-cudagraphs. Supporting other modes and backends requires adding soft dependencies such as triton to DeepChem; these will be added in follow-up PRs.
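To give an idea of what such a restricted compile function looks like, here is a minimal sketch. It is not the code from the PR: `compile_module` and `SUPPORTED_MODES` are hypothetical names, and the function simply validates its arguments and forwards them to `torch.compile`. torch is imported inside the function so the validation logic can be read and run on its own.

```python
# Modes and backend named in this update; everything else is rejected.
SUPPORTED_MODES = ("default", "max-autotune-no-cudagraphs")
SUPPORTED_BACKENDS = ("inductor",)


def compile_module(module, mode="default", backend="inductor"):
    """Compile a torch.nn.Module with a restricted set of options.

    Hypothetical wrapper for illustration; not DeepChem's actual API.
    """
    if mode not in SUPPORTED_MODES:
        raise ValueError(
            f"Unsupported mode {mode!r}; expected one of {SUPPORTED_MODES}")
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unsupported backend {backend!r}; expected one of {SUPPORTED_BACKENDS}")

    # Imported lazily so that argument validation does not require torch.
    import torch

    return torch.compile(module, mode=mode, backend=backend)
```

With PyTorch 2.x installed, calling `compile_module(my_module, mode="max-autotune-no-cudagraphs")` on any `torch.nn.Module` (here the placeholder `my_module`) returns a compiled module that is called the same way as the original.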

Weekly Update #2 (June 3 - June 8):

Started working on the follow-up PR but wasn’t able to make much progress as I was busy with academics. Will catch up on the work in the coming week.

Weekly Update #3 (June 9 - June 14):

  • Opened another PR for adding support for the rest of the modes for the compile function to DeepChem - PR

  • Started working on the tutorial for compiling DeepChem PyTorch models using the added compile function.

Once the above linked PR is merged, we will be able to use all the available modes for compiling the models. The modes reduce-overhead and max-autotune need triton to be installed.
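Since triton is a soft dependency, the natural approach is to check for it only when a mode that needs it is requested. A minimal sketch of such a check is below. The names `TRITON_MODES` and `missing_dependency` are hypothetical and this is not the code from the PR; the set of modes that need triton is taken from the statement above.

```python
import importlib.util

# Modes that need triton, as stated in this update.
TRITON_MODES = frozenset({"reduce-overhead", "max-autotune"})


def missing_dependency(mode):
    """Return "triton" if `mode` needs it and it is not installed, else None.

    Hypothetical helper for illustration only.
    """
    # find_spec looks the package up without actually importing it.
    if mode in TRITON_MODES and importlib.util.find_spec("triton") is None:
        return "triton"
    return None


for mode in ("default", "reduce-overhead", "max-autotune"):
    missing = missing_dependency(mode)
    status = "ok" if missing is None else f"needs {missing}"
    print(f"{mode}: {status}")
```

A compile function can call a check like this first and raise a clear error that names the missing package, instead of failing somewhere deep inside the compiler.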

Weekly Update #4 (June 15 - June 21):

  • Got the PR opened last week merged.

  • Wrote the draft tutorial for using the compile() function in DeepChem.

  • Discussed the next steps in the project with my mentor. Some of the possibilities are:

    • Run profiling on DeepChem training function and check which parts of the code are causing the most overhead.

    • Look into libraries like PyTorch ao and torchmd and how to integrate them into DeepChem.


Weekly Update #5 (June 22 - June 29):

  • Opened a PR for adding the tutorial.
  • Started going through the PyTorch tutorials for profiling torch.compile and Python code.

I got an initial review of the tutorial from other GSoC contributors and my mentor and updated it accordingly. The initial draft assumed a lot of prior knowledge from the reader, such as graph capture in neural networks. I modified the tutorial to state this assumed knowledge and linked to resources that can be referred to for background. A few more modifications were made to make the tutorial clearer and to highlight how the compile() function can be used effectively.

Of the future work items discussed last week, I decided to go with profiling torch.compile and the DeepChem fit() function for now. I started looking into tutorials for both and ran some sample scripts.
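As an example of the kind of sample script I mean, here is a small sketch that profiles a training-style function with Python's built-in cProfile. The `fit`, `load_batch` and `train_step` functions are toy stand-ins I made up for illustration, not DeepChem code, and `profile_call` is a hypothetical helper.

```python
import cProfile
import io
import pstats


def load_batch(size):
    """Toy stand-in for data loading."""
    return [float(i) for i in range(size)]


def train_step(batch):
    """Toy stand-in for a forward/backward pass."""
    return sum(x * x for x in batch)


def fit(num_batches=50, batch_size=1000):
    """Toy stand-in for a training loop."""
    total = 0.0
    for _ in range(num_batches):
        total += train_step(load_batch(batch_size))
    return total


def profile_call(func, *args, top=5, **kwargs):
    """Run func under cProfile; return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


result, report = profile_call(fit)
print(report)
```

Sorting by cumulative time shows which calls account for most of the time spent inside the loop, which is the question I want to answer for the real fit() function.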

Weekly Update #6 (June 29 - July 5):

The tutorial for using compile() was merged and is available on the DeepChem tutorials page here.

Wasn’t able to work on the project as my semester exams are going on.

Weekly Update #7 (July 6 - July 12):

Wasn’t able to work on the project as my end-semester exams are going on. Will continue working from the 18th, when the exams are over.