Google Summer of Code 2022: Porting of Normalizing Flow Model from TensorFlow to PyTorch

About me:

Hello everybody!

My name is Jose Antonio Siguenza and I’ll be working during this summer as a contributor in DeepChem via Open Chemistry in GSoC 2022.

Currently, I’m in my last year of Chemical Engineering studies in Ecuador. I’ve also been involved in university research on topics related to data science, deep learning, and science. Over the past year, I’ve gotten to know DeepChem, and I started contributing in December 2021. In general, this has been a fascinating learning path about DL models and their applications in life sciences, especially generative models. DeepChem’s community and documentation have made this journey more enjoyable.

Project Description:

Lately, DeepChem has decided to port some TensorFlow models to PyTorch. This project aims to migrate one of the models on the organization’s porting list, the Normalizing Flow model. This model applies a sequence of invertible transformations between the base and target distributions. To optimize the results, a Normalizing Flow object initialized with the flow layers and the base distribution is trained iteratively, computing the loss at each epoch. For this reason, transformations and flow layers will be created along with their respective unittests. Finally, the implementation will include the corresponding documentation and a tutorial, if applicable, following DeepChem’s API.
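
For reference, the loss in such a loop is usually the negative log-likelihood obtained from the change-of-variables formula. The notation below is assumed here for illustration and is not taken from the DeepChem code: f is the composed flow, p_X the base density, p_Y the density of the transformed variable, and y_1, ..., y_N the training points.

```latex
\log p_Y(y) = \log p_X\big(f^{-1}(y)\big)
            + \log\left|\det \frac{\partial f^{-1}(y)}{\partial y}\right|,
\qquad
\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} \log p_Y(y_i)
```

The second term is the log-determinant of the Jacobian of the inverse pass, which is why every flow layer has to report it.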

Contact:
GitHub: @JoseAntonioSiguenza
E-mail: jasiguen@espol.edu.ec - jasiguenespol@gmail.com

June 1, 2022: PR #2918 on NormalizingFlow and Affine classes

This is the first PR, which includes the NormalizingFlow model and the Affine class. The latter is a commonly used layer (a bijective transformation). The affine transformation is a geometric operation expressed as y = exp(a)*x + b, where a and b are the scale and shift parameters, respectively. The layer also computes the logarithm of the determinant of the Jacobian matrix, a term that the optimization loop needs. There are 11 failing checks related to formatting and tests.
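
To make the formula concrete, here is a minimal sketch of an elementwise affine layer in PyTorch. It is not the code from the PR: the class name AffineSketch, the attribute names, and the (output, log_det) return convention are assumptions made for illustration. Since the Jacobian of y = exp(a)*x + b is the diagonal matrix diag(exp(a)), its log-determinant is simply the sum of a.

```python
import torch
from torch import nn


class AffineSketch(nn.Module):
    """Hypothetical elementwise affine layer: y = exp(a) * x + b."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.scale = nn.Parameter(torch.zeros(dim))  # a (log-scale)
        self.shift = nn.Parameter(torch.zeros(dim))  # b

    def forward(self, x: torch.Tensor):
        """Base -> target direction; x has shape (n_samples, dim)."""
        y = torch.exp(self.scale) * x + self.shift
        # Jacobian is diag(exp(a)), so log|det J| = sum(a) for every row.
        log_det = self.scale.sum().expand(x.shape[0])
        return y, log_det

    def inverse(self, y: torch.Tensor):
        """Target -> base direction; undoes forward()."""
        x = (y - self.shift) * torch.exp(-self.scale)
        # The inverse Jacobian is diag(exp(-a)), so the sign flips.
        inv_log_det = (-self.scale.sum()).expand(y.shape[0])
        return x, inv_log_det
```

Calling inverse on the output of forward should recover the input, and the two log-determinants should cancel, which is a convenient property to check in a unit test.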

Given Feedback:

  • Split the PR into smaller ones
  • Add docstrings, type annotations, and tests
  • Add new layers/models to the docs folder

June 10, 2022: Split initial PR and constructed Affine class

Split the NormalizingFlow and NormalizingFlowModel classes out of the initial PR and focused on the Affine class.

Hands-on:

June 17, 2022: PR #2944 on NormalizingFlow model

PR #2918 got merged. It adds a bijective transformation (the Affine transformation) that is used as a layer in the Normalizing Flow model. I’ve also opened #2944, which mainly contains a draft of the NormalizingFlow model along with documentation, type annotations, docstrings, and code conventions.

To do:

  • Add unittests
  • Add doctests
  • Improve docstrings (too general)
  • Update docs folder (models.rst)

June 24, 2022: PR #2944 on NormalizingFlow model

This week, the model was completed. It composes the layer transformations as an nn.ModuleList. The resulting NormalizingFlow model provides two methods:

  • Sampling:

    • Receives as a parameter the number of samples (int) to compute.
    • Returns a tuple (samples, log_prob) with shapes (n_samples, dim) and (n_samples,). samples is a torch.Tensor of N points obtained by transforming draws from base_distribution through the layers. log_prob is the logarithm of the Jacobian determinant (the deviation between the base and transformed distributions).
  • Log_prob:

    • Receives a tensor as input. This tensor will be evaluated with the inverse pass (this is very important for the optimization loop when given a target distribution)
  • Documentation and test

    • Add an example of usage performing an Affine Transformation.
    • Update format according to NumPy style.
    • Type annotations
  • To do:

    • Improve the test by changing the assert condition.
    • Include a Transform class to perform forward and inverse passes through the transformation layers.
    • Add type hints to base_distribution parameter
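
The two methods described above can be sketched as follows. This is a simplified illustration, not the code in #2944: FlowSketch, ToyAffine, the MultivariateNormal base distribution, the toy target data, and the training hyperparameters are all assumptions. As in the description, sample returns the accumulated log-determinant alongside the samples, and log_prob runs the inverse pass, which is what the optimization loop uses.

```python
import torch
from torch import nn
from torch.distributions import MultivariateNormal


class ToyAffine(nn.Module):
    """Hypothetical stand-in layer: y = exp(a) * x + b."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.scale = nn.Parameter(torch.zeros(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        y = torch.exp(self.scale) * x + self.shift
        return y, self.scale.sum().expand(x.shape[0])

    def inverse(self, y):
        x = (y - self.shift) * torch.exp(-self.scale)
        return x, (-self.scale.sum()).expand(y.shape[0])


class FlowSketch(nn.Module):
    """Hypothetical flow: a base distribution plus a list of layers."""

    def __init__(self, base_distribution, layers) -> None:
        super().__init__()
        self.base_distribution = base_distribution
        self.layers = nn.ModuleList(layers)

    def sample(self, n_samples: int):
        # Forward pass: push base samples through every layer in order.
        x = self.base_distribution.sample((n_samples,))
        log_det = torch.zeros(n_samples)
        for layer in self.layers:
            x, ld = layer(x)
            log_det = log_det + ld
        return x, log_det

    def log_prob(self, y: torch.Tensor) -> torch.Tensor:
        # Inverse pass: map data back to the base space, last layer first.
        log_det = torch.zeros(y.shape[0])
        for layer in reversed(list(self.layers)):
            y, ld = layer.inverse(y)
            log_det = log_det + ld
        return self.base_distribution.log_prob(y) + log_det


# Toy optimization loop: fit the flow to shifted and scaled Gaussian data.
torch.manual_seed(0)
dim = 2
base = MultivariateNormal(torch.zeros(dim), torch.eye(dim))
flow = FlowSketch(base, [ToyAffine(dim)])
target = torch.randn(512, dim) * 2.0 + 3.0  # made-up target data
optimizer = torch.optim.Adam(flow.parameters(), lr=0.05)
for epoch in range(500):
    optimizer.zero_grad()
    loss = -flow.log_prob(target).mean()  # negative log-likelihood
    loss.backward()
    optimizer.step()

samples, log_det = flow.sample(5)  # shapes (5, dim) and (5,)
```

With a single affine layer the flow can only learn a per-dimension shift and scale, so after training the shift parameter should sit near the mean of the target data. Richer target distributions need more expressive layers.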