GSoC 2024 Project: Porting Smiles2Vec from TensorFlow to PyTorch

Hey Everyone,
My name is Harishwar, and I am a fourth-year undergraduate student at IIT Kharagpur. I am excited to document my learning as part of GSoC 2024, where I will be working on porting the Smiles2Vec model in the DeepChem library from TensorFlow to PyTorch.

I shall be updating my progress here over the summer on a weekly basis! Stay tuned!

Week - 1 Updates

  • Trying to run the Smiles2Vec torch model in a Colab notebook as a first step
  • Regression mode worked absolutely fine for the first few epochs, but it outputs NaN when the number of epochs is high
  • Feedback received on solving that: try decreasing the learning rate, since the loss or weights could have shot up
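A minimal sketch of that feedback in action: a regression training loop with a modest learning rate and a guard that fails loudly the moment the loss turns NaN. The model, data, and learning rate here are illustrative placeholders, not the actual Smiles2Vec code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Placeholder regression model and synthetic data, not the DeepChem model.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
X, y = torch.randn(64, 8), torch.randn(64, 1)

# A smaller learning rate was the suggested first fix for the NaN losses.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(50):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    if torch.isnan(loss):
        # Fail fast instead of silently training on garbage.
        raise RuntimeError(f"NaN loss at epoch {epoch}")
    loss.backward()
    opt.step()
```

With a lower learning rate the loss stays finite across many epochs; the same guard placed in a longer run is what surfaces the blow-up early.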

Week - 2 Updates, (7 - 14 June)

  • Implemented the overfitting test successfully
  • Linting partially done; a few changes are still left
  • Trying to solve the exploding-gradients issue: I am getting NaN after a few epochs of training
  • Used torch.isnan() to identify where the values turn into NaN. It was after an LSTM layer.
  • Trying to use torch.nn.utils.clip_grad_norm_ to prevent gradients from exploding
  • To-do: save-reload test
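The two debugging steps above can be sketched together: probe a layer's output with torch.isnan(), then clip gradient norms before the optimizer step. The tiny LSTM and input tensors are placeholders, not the DeepChem model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Stand-in for the LSTM layer after which the NaNs appeared.
lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(2, 5, 4)
out, _ = lstm(x)

# torch.isnan() flags exactly where values turn into NaN.
assert not torch.isnan(out).any()

# Clip gradient norms to tame exploding gradients; clip_grad_norm_
# rescales all parameter gradients so their total norm is at most max_norm.
out.sum().backward()
total_norm = torch.nn.utils.clip_grad_norm_(lstm.parameters(), max_norm=1.0)
```

In a real training loop the clipping call goes between loss.backward() and optimizer.step(); the returned total_norm (the pre-clip norm) is also handy to log when hunting for the epoch where gradients blow up.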

Week - 3 Updates, (14 - 21 June)

  • Implemented a test for the nn.Module class
  • Followed all the DeepChem coding conventions by running YAPF, flake8, and doctest
  • The NaN issue was due to a problem with my local environment, so I tested in standard environments
  • To-do: save-reload test

Week - 4 Updates, (21 - 28 June)

  • Wrote a test for the code written so far, i.e. a test to check the forward function of the nn.Module class
  • Made my first PR! ( link: https://github.com/deepchem/deepchem/pull/4018 )
  • Fixed a few type annotation issues after the CI had run, and got it reviewed by my mentor
  • To-do: getting the PR merged
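A forward-function test of the kind described above might look as follows. TinyModel is a hypothetical stand-in with the same rough shape as Smiles2Vec (embedding, recurrent layer, output head); the real class lives in DeepChem.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the ported nn.Module, not the DeepChem class.
class TinyModel(nn.Module):

    def __init__(self, vocab=32, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, 1)

    def forward(self, x):
        h, _ = self.gru(self.emb(x))
        # Use the last timestep's hidden state for the prediction.
        return self.out(h[:, -1])

def test_forward_shape():
    # The forward test checks that a batch of token indices produces
    # an output of the expected shape without errors.
    model = TinyModel()
    x = torch.randint(0, 32, (4, 10))
    y = model(x)
    assert y.shape == (4, 1)

test_forward_shape()
```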

Week - 5 Updates, (29 June - 5 July)

  • Made a few doc fixes in the PR
  • Rebased my PR since there were a few merge conflicts! ( link: https://github.com/deepchem/deepchem/pull/4018 )
  • Fixed a few type annotation issues after the CI had run, and got it reviewed by my mentor
  • To-do: getting the PR merged

Week - 6 Updates, (5 July - 12 July)

  • Wrote overfitting tests for regression and classification modes
  • Wrote the save-reload test
  • Got my PR reviewed by my mentor
  • To-do: getting the PR merged
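A save-reload test can be sketched with plain torch.save/torch.load on a state dict: save the weights, load them into a fresh instance, and check that both models give identical outputs. The Linear model here is a placeholder; DeepChem wraps the real model with its own save/restore logic.

```python
import os
import tempfile
import torch
import torch.nn as nn

torch.manual_seed(0)
# Placeholder model; the actual test uses the ported Smiles2Vec model.
model = nn.Linear(4, 2)
x = torch.randn(3, 4)
before = model(x)

# Save the weights, reload into a freshly initialized instance.
path = os.path.join(tempfile.mkdtemp(), "model.pt")
torch.save(model.state_dict(), path)
reloaded = nn.Linear(4, 2)
reloaded.load_state_dict(torch.load(path))

# The reloaded model must reproduce the original outputs exactly.
after = reloaded(x)
assert torch.allclose(before, after)
```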

Week - 7 Updates, (13 July - 19 July)

  • Removed the overridden default_generator function from my PR, since it wasn't necessary
  • Made my get_dataset function create one-hot encodings, which was previously done by the overridden default_generator
  • Added the model to the model cheatsheet
  • Got my PR reviewed by my mentor
  • To-do: getting the PR merged
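One-hot encoding a SMILES string, as get_dataset now does, can be sketched like this. The character set and maximum length below are made up for illustration; DeepChem derives its own vocabulary from the dataset.

```python
import numpy as np

# Hypothetical character set; the real one is built from the training data.
CHARSET = ["C", "O", "N", "(", ")", "=", "1", "2"]

def one_hot_smiles(smiles, charset=CHARSET, max_len=10):
    # Map each SMILES character to a one-hot row; pad short strings with zeros.
    idx = {c: i for i, c in enumerate(charset)}
    arr = np.zeros((max_len, len(charset)), dtype=np.float32)
    for pos, ch in enumerate(smiles[:max_len]):
        arr[pos, idx[ch]] = 1.0
    return arr

# "CCO" (ethanol) fills the first three rows; the rest stay zero padding.
enc = one_hot_smiles("CCO")
```

The resulting (max_len, charset_size) array is what the recurrent layers consume, one one-hot vector per character position.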

Week - 8 Updates, (20 July - 27 July)

  • Working on implementing a test to check whether the torch model and the already existing tf model output the same results for equal weights
  • Running into an issue with the GRU layer, since the Keras layer by default has output activation = tanh and recurrent activation = sigmoid
  • Might have to implement a custom GRU cell and layer with the desired activation functions
  • To-do: discussing this with my mentor

Week - 9 Updates, (27 July - 4 August)

  • Working on implementing a test to check whether the torch model and the already existing tf model output the same results for equal weights
  • Was running into an issue with the GRU layer, and finally resolved it by moving the torch model from the "mps" device to "cpu"
  • "mps" and "cpu" computations can differ numerically, and Keras models run on the CPU by default while torch models use "mps" if it is available
  • To-do: check for other layers
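The device mismatch can be checked directly: compute a reference GRU output on the CPU and, when the Apple "mps" backend is present, compare the same computation there. This is a sketch of the idea, not the actual parity test in the PR.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
gru = nn.GRU(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(1, 6, 4)

# Reference output on the CPU, matching where Keras runs by default.
cpu_out, _ = gru(x)

# If the Apple MPS backend is available, run the same layer there.
# Small numeric differences between backends were the source of the mismatch,
# so any comparison should use a tolerance rather than exact equality.
if torch.backends.mps.is_available():
    mps_out, _ = gru.to("mps")(x.to("mps"))
    max_diff = (mps_out.cpu() - cpu_out).abs().max()
```

Pinning both models to the CPU removes the backend difference entirely, which is why moving the torch model off "mps" resolved the GRU discrepancy.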

Week - 10 Updates, (4 August - 11 August)

  • Implemented the test to check whether the torch model and the already existing tf model output the same results for equal weights
  • Resolved the lingering GRU layer issue by switching from the "mps" device to "cpu"
  • Made the commits to the PR
  • To-do: get the PR reviewed