GSoC 2024 Project: Porting Smiles2Vec from TensorFlow to PyTorch

Hey everyone,
My name is Harishwar, and I am a 4th-year undergraduate student at IIT Kharagpur. I am excited to document my learning as part of GSoC 2024, where I will be porting the Smiles2Vec model in the DeepChem library from TensorFlow to PyTorch.

I shall be updating my progress here over the summer on a weekly basis! Stay tuned!

Week - 1 Updates

  • Trying to run the Smiles2Vec torch model in a Colab notebook as a first step
  • Regression mode worked fine for the first few epochs, but it outputs NaN when the number of epochs is high
  • Feedback received on solving that: the loss or weights could have shot up, so try decreasing the learning rate

Week - 2 Updates, (7 - 14 June)

  • Implemented overfitting test successfully

  • Linting partially done, still a few changes are left

  • Trying to solve an exploding-gradients issue: I am getting NaN after a few epochs of training

  • Used torch.isnan() to identify where the values turn into NaN. It was after an LSTM layer.

  • Trying to use PyTorch's gradient clipping utilities (e.g. torch.nn.utils.clip_grad_norm_) to prevent gradients from exploding.

  • To-do: save reload test
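
For anyone debugging a similar NaN problem, the two ideas above (locating the first layer that emits NaN with torch.isnan(), then clipping gradients) can be sketched roughly as follows. This is only an illustration under assumptions: ToyRecurrentNet, find_nan_modules and clipped_step are hypothetical names made up for this example and are not the actual Smiles2Vec or DeepChem code, and max_norm=1.0 is an arbitrary value.

```python
import torch
import torch.nn as nn


class ToyRecurrentNet(nn.Module):
    """Hypothetical stand-in model, not the DeepChem Smiles2Vec code."""

    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(10, 8)
        self.lstm = nn.LSTM(8, 16, batch_first=True)
        self.head = nn.Linear(16, 1)

    def forward(self, tokens):
        hidden, _ = self.lstm(self.embedding(tokens))
        return self.head(hidden[:, -1, :])  # predict from the last time step


def find_nan_modules(model, inputs):
    """Run one forward pass and return names of submodules that output NaN."""
    flagged = []
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            # Recurrent layers return a tuple; only tensors are inspected.
            tensors = output if isinstance(output, tuple) else (output,)
            for t in tensors:
                if isinstance(t, torch.Tensor) and torch.isnan(t).any():
                    flagged.append(name)
                    break
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module itself
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(inputs)
    finally:
        for handle in handles:
            handle.remove()  # always detach the hooks again
    return flagged


def clipped_step(model, optimizer, loss, max_norm=1.0):
    """One optimizer step with the total gradient norm clipped to max_norm."""
    optimizer.zero_grad()
    loss.backward()
    total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return float(total_norm)  # norm measured before clipping
```

The hooks fire in the order the layers run, so the first name returned by find_nan_modules is where the NaN first appears. The value returned by clipped_step is the gradient norm before clipping, which is handy to log when checking whether gradients really are exploding.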

Week - 3 Updates, (14 - 21 June)

  • Implemented a test for the nn.Module class
  • Followed all the DeepChem coding conventions by running YAPF, flake8, and doctest
  • The NaN issue was due to a problem with my local environment, so I tested it in standard environments
  • To-do: save reload test

Week - 4 Updates, (21 - 28 June)

  • Wrote the test for the code written so far, i.e. a test to check the forward function of the nn.Module class
  • Made my first PR! ( link: https://github.com/deepchem/deepchem/pull/4018 )
  • Fixed a few type annotation issues after the CI had run, and got it reviewed by my mentor
  • To-do: getting the PR merged
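
As a rough idea of what a forward-function test for an nn.Module can look like, here is a minimal sketch. It is not the test from the PR: TinySmilesNet is a hypothetical toy model (embedding, 1D convolution, bidirectional GRU, linear head) and all of its sizes are made-up values chosen only for illustration.

```python
import torch
import torch.nn as nn


class TinySmilesNet(nn.Module):
    """Hypothetical toy model: embedding -> 1D conv -> GRU -> linear head."""

    def __init__(self, vocab_size=32, embed_dim=16, filters=8, hidden=12,
                 n_tasks=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, filters, kernel_size=3, padding=1)
        self.rnn = nn.GRU(filters, hidden, batch_first=True,
                          bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_tasks)

    def forward(self, tokens):
        x = self.embedding(tokens)                     # (batch, seq, embed)
        x = torch.relu(self.conv(x.transpose(1, 2)))   # (batch, filters, seq)
        out, _ = self.rnn(x.transpose(1, 2))           # (batch, seq, 2*hidden)
        return self.head(out[:, -1, :])                # (batch, n_tasks)


def test_forward_shape():
    """Check that forward returns finite values of the expected shape."""
    torch.manual_seed(0)
    model = TinySmilesNet()
    tokens = torch.randint(0, 32, (4, 20))  # 4 fake tokenized SMILES strings
    preds = model(tokens)
    assert preds.shape == (4, 3)
    assert torch.isfinite(preds).all()
```

A test written this way can be collected by pytest as-is, since the function name starts with test_.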

Week - 5 Updates, (29 June - 5 July)

  • Made a few doc fixes in the PR
  • Rebased my PR since there were a few merge conflicts! ( link: https://github.com/deepchem/deepchem/pull/4018 )
  • Fixed a few type annotation issues after the CI had run, and got it reviewed by my mentor
  • To-do: getting the PR merged

Week - 6 Updates, (5 July - 12 July)

  • Wrote overfitting tests for regression and classification modes
  • Wrote a save-reload test
  • Got my PR reviewed by my mentor
  • To-do: Getting the PR merged
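
To illustrate the idea behind the two kinds of tests mentioned above, here is a minimal sketch in plain PyTorch. It is an assumption-laden example rather than the tests from the PR: make_model, overfit and save_reload_matches are hypothetical helpers, the model is a small MLP instead of Smiles2Vec, and it saves a raw state_dict rather than going through DeepChem's own model wrapper.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    """Hypothetical small regression model used only for this sketch."""
    return nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))


def overfit(model, x, y, steps=500, lr=0.01):
    """Train on one tiny batch and return (first_loss, last_loss)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses[0], losses[-1]


def save_reload_matches(model, x):
    """Save weights, load them into a fresh model, and compare predictions."""
    model.eval()
    with torch.no_grad():
        before = model(x)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "checkpoint.pt")
        torch.save(model.state_dict(), path)
        reloaded = make_model()
        reloaded.load_state_dict(torch.load(path))
    reloaded.eval()
    with torch.no_grad():
        after = reloaded(x)
    return torch.allclose(before, after)
```

An overfitting test then asserts that the last loss is much lower than the first on a handful of samples, and a save-reload test asserts that save_reload_matches returns True. A classification variant would follow the same pattern with a classification loss such as nn.CrossEntropyLoss.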