Torch compile and PyTorch 2.2.0 | GSoC 2024

Hey everyone,

I will be working on the project “Torch compile and PyTorch 2.2.0” as part of GSoC 2024. The project involves adding a feature and relevant tutorials to the DeepChem library for compiling PyTorch models for faster training and inference using torch.compile. More details of the project are available on the official GSoC project page: https://summerofcode.withgoogle.com/programs/2024/projects/Xwr0Z1tZ.

PyTorch 2.0 introduced torch.compile, a compiler that speeds up training and inference by compiling PyTorch code into optimized kernels, offering advantages over existing approaches like TorchScript and FX tracing. By introducing a way to compile DeepChem models, users will be able to take advantage of the efficient training and deployment offered by torch.compile. For more information about torch.compile, you can visit the official PyTorch docs or read the PyTorch 2.0 get-started blog (https://pytorch.org/get-started/pytorch-2.0/), which gives an excellent explanation of the rationale, the internal workings, and why such a feature was needed.
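As a minimal illustration (a toy example of my own, not DeepChem code), wrapping a plain PyTorch model with torch.compile is a one-line change:

```python
import torch

class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 1)

    def forward(self, x):
        return torch.relu(self.linear(x))

# torch.compile returns an optimized callable wrapping the model.
# The first call triggers compilation; later calls reuse the generated kernels.
model = torch.compile(ToyModel())
out = model(torch.randn(8, 16))
```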

Over the span of GSoC, I will be using this forum thread to track my progress. I will post weekly updates about the project to this thread along with other relevant details such as the resources I referred to and the design decisions I make.

Feel free to reply to this thread for any further clarification regarding the project or the following updates.

You can connect with me via my
Twitter: gaushn_
Discord: ambient_pressure_perceptron

Weekly Update #1 (May 27 - June 2):

  • Made a Colab notebook for compiling and benchmarking the DMPNN model using a functional implementation of torch.compile for DeepChem models - Colab

  • Made a PR for adding torch.compile to DeepChem’s TorchModel class - PR

The PR consists only of a basic implementation of the compile function, supporting the inductor backend (the default) and the default and max-autotune-no-cudagraphs modes. Supporting the other modes and backends will require adding soft dependencies such as Triton to DeepChem, which will be done in follow-up PRs.
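For illustration, here is a rough sketch of how I expect the new method to be used; the exact signature may differ from the merged PR, and the dataset and model below are placeholders:

```python
import numpy as np
import deepchem as dc

# Toy dataset and model purely for illustration.
X = np.random.rand(64, 10)
y = np.random.rand(64, 1)
dataset = dc.data.NumpyDataset(X, y)
model = dc.models.MultitaskRegressor(n_tasks=1, n_features=10, layer_sizes=[32])

# Sketch: compile before fitting; the backend defaults to "inductor".
model.compile(mode="default")
model.fit(dataset, nb_epoch=1)
```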

Weekly Update #2 (June 3 - June 8):

Started working on the follow-up PR but wasn’t able to make much progress as I was busy with academics. Will catch up on the work in the coming week.

Weekly Update #3 (June 9 - June 14):

  • Opened another PR adding support for the remaining modes of the compile function in DeepChem - PR

  • Started working on the tutorial for compiling DeepChem PyTorch models using the added compile function.

Once the above-linked PR is merged, all the available modes can be used for compiling models. The reduce-overhead and max-autotune modes need Triton to be installed.
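Continuing the sketch from Weekly Update #1, mode selection would then look roughly like this (assuming Triton is installed, e.g. via `pip install triton`):

```python
# Triton-backed modes, available once Triton is installed:
model.compile(mode="reduce-overhead")  # lowers per-call overhead
# or
model.compile(mode="max-autotune")     # longest compile time, fastest kernels
```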

Weekly Update #4 (June 15 - June 21):

  • Got the PR opened last week merged.

  • Wrote the draft tutorial for using the compile() function in DeepChem.

  • Discussed the next steps of the project with my mentor. Some of the possibilities are:

    • Run profiling on DeepChem's training function and check which parts of the code cause the most overhead.

    • Look into libraries like PyTorch ao and torchmd, and how they could be integrated into DeepChem.


Weekly Update #5 (June 22 - June 29):

  • Opened a PR for adding the tutorial.
  • Started going through the PyTorch tutorials for profiling torch.compile and Python code.

I got an initial review of the tutorial from other GSoC contributors and my mentor and updated it accordingly. The initial draft assumed a lot of prior knowledge from the reader, such as graph capture in neural networks. I modified the tutorial to state these assumptions up front and linked to other resources that can be referred to for background. A few more changes were made to make the tutorial clearer, highlighting how the compile() function can be used effectively.

Of the future directions discussed last week, I decided to go with profiling torch.compile and DeepChem's fit() function for now. I started looking into tutorials for both and ran some sample scripts.
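As an example of the kind of sample script I mean (a toy module, not a DeepChem model), torch.profiler can be wrapped around a compiled module like this:

```python
import torch
from torch.profiler import ProfilerActivity, profile

# Toy module; any nn.Module works the same way.
model = torch.compile(torch.nn.Linear(64, 64))
x = torch.randn(32, 64)
model(x)  # warm-up call so the one-time compilation cost is excluded

with profile(activities=[ProfilerActivity.CPU]) as prof:
    model(x)
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```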

Weekly Update #6 (June 29 - July 5):

The tutorial for using compile() was merged and is available on the DeepChem tutorials page here.

Wasn’t able to work on the project as my semester exams are going on.

Weekly Update #7 (July 6 - July 12):

Wasn’t able to work on the project as my end-semester exams are going on. Will continue working from the 18th, when the exams are over.

Weekly Update #8 (July 13 - July 20):

  • Opened another PR adding some more contextual information on the need for optimization to the tutorial.
  • Profiled the training of a few compiled and uncompiled models by running cProfile on the fit() function. Also visualized the profiling results using SnakeViz.
  • Started making a doc explaining the results obtained after profiling.

SnakeViz cannot be run from Colab, so sharing an interactive online notebook won't be possible. Instead, next week I'll share a notebook that can be downloaded and run locally to visualize the profiling results.
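For reference, the profiling step itself is straightforward. A minimal sketch of what I'm running (the model and dataset below are placeholders; substitute whatever model you want to profile):

```python
import cProfile
import pstats

import numpy as np
import deepchem as dc

# Placeholder model and dataset purely for illustration.
dataset = dc.data.NumpyDataset(np.random.rand(64, 10), np.random.rand(64, 1))
model = dc.models.MultitaskRegressor(n_tasks=1, n_features=10, layer_sizes=[32])

# Profile fit() and dump the stats to a file SnakeViz can read.
profiler = cProfile.Profile()
profiler.enable()
model.fit(dataset, nb_epoch=1)
profiler.disable()
profiler.dump_stats("fit_profile.prof")

# Print the top hotspots, or run `snakeviz fit_profile.prof` locally.
pstats.Stats("fit_profile.prof").sort_stats("cumulative").print_stats(10)
```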

Weekly Update #9 (July 20 - July 26):

  • Completed the document showing the results of profiling a few DeepChem models. You can find the document here: https://docs.google.com/document/d/16h1ypzS2roEzT0ZBmx2y0kau4XMaprcXxyy1UePP_5A/edit

  • Some of the conclusions that I came to after looking at the results are:

    • The profiling results for the fit() function match expectations: the backward pass, batch preparation, and the forward pass take the most time when training a model.
    • In the _prepare_batches function, most of the time is spent moving data to the GPU.
    • There is no significant overhead added by DeepChem to ChemBERTa over the base Hugging Face model, as I had previously thought. The difference in time shown earlier was due to a bug in how the models were benchmarked. The bug is fixed now, and both models have similar inference times.

I will present the results to my mentor next week and decide on how to proceed.

Weekly Update #10 (July 27 - August 2):

  • The previous PR on updating the tutorial got merged.
  • Started working on open DeepChem issues related to the performance of models and other functions. Since I have already benchmarked model performance for compilation and profiled the fit() function of different DeepChem models, these issues should be easier to work on, as I can reuse the tools and functions I used before for measuring performance.

Wasn’t able to do much work this week as I was travelling. Will work on solving the open issues in the coming week.

Weekly Update #11 (August 3 - August 9):

  • Opened another PR implementing the change suggested in this issue: an easy way to disable caching in DiskDataset.
  • I initially planned to add cache_data as a property that can be used for enabling or disabling the cache. However, after discussing this with the mentors, I updated the PR to allow cache_data to be set in the initializer, which can be converted to a property later on.
  • The problem with having cache_data only as an initializer argument and field variable is that, if the field is changed from True to False after initialization, the already cached data would not be cleared from memory. A property setter could handle this, as shown in the sketch after this list.
  • I'm currently waiting for a review on the PR and working on other open issues.
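To illustrate the design point in the third bullet, here is a toy sketch (not DeepChem's actual DiskDataset) of how a property setter could clear the cache when caching is turned off:

```python
class ToyDiskDataset:
    """Toy illustration of the cache_data design discussed above."""

    def __init__(self, cache_data: bool = True):
        self._cache_data = cache_data
        self._cache: dict = {}

    @property
    def cache_data(self) -> bool:
        return self._cache_data

    @cache_data.setter
    def cache_data(self, value: bool) -> None:
        self._cache_data = value
        if not value:
            # Without this, shards cached while the flag was True
            # would stay in memory after caching is disabled.
            self._cache.clear()

    def get_shard(self, i: int):
        if self._cache_data and i in self._cache:
            return self._cache[i]
        data = self._load_shard(i)
        if self._cache_data:
            self._cache[i] = data
        return data

    def _load_shard(self, i: int):
        # Stand-in for reading a shard from disk.
        return f"shard-{i}"
```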

Weekly Update #12 (August 10 - August 16):

  • Started working on the final GSoC Report.
  • Added content about the project overview and the integration of torch.compile() and the tutorial into DeepChem to the report. Also included the benchmarking results obtained.
  • Currently working on adding the profiling results to the report.

I’m planning to finish the report in two days and get it reviewed by my mentor. Once that is done, I’ll continue working on the open PRs and issues.