Graph representation

Hello everybody! This is my first post here, so a brief introduction first:
my name is Gianmarco, and I'm an undergraduate Medicinal Chemistry student preparing my dissertation. My idea is to build a VAE or a GAN capable of generating new drugs, using graphs to represent the molecules. Now for the actual question:

I started the project with a simple Pandas dataframe made up of SMILES strings and various features, like this one:

  • CC(=O)Nc1ccc(O)cc1, weight = 151.16, …
  • CC(=O)Oc1ccccc1C(=O)O, weight = 180, …

Is it possible to convert the strings into a graph data format? If so, could you give me some suggestions on how to do that?

Thank you all! Giammy98

Yes, for sure. Check out some of the featurizers in deepchem.feat. You might also be interested in the tutorial on normalizing flows for generating new structures.
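
To make the conversion concrete, here is a minimal sketch of turning a SMILES string into a graph. It uses RDKit directly rather than a DeepChem featurizer, so it is only an illustration of the idea, not what deepchem.feat does internally. The function name smiles_to_graph and the choice of features (atomic number per atom, bond order per bond) are assumptions made for this example; a real model would usually want richer features.

```python
from rdkit import Chem


def smiles_to_graph(smiles):
    """Convert one SMILES string into (node_features, edge_index, edge_features).

    Hypothetical helper for illustration only:
    - node_features: atomic number of each heavy atom
    - edge_index:    (source, target) atom index pairs, both directions
    - edge_features: bond order for each directed edge
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # RDKit returns None when it cannot parse the string
        raise ValueError(f"RDKit could not parse SMILES: {smiles!r}")

    # One node per heavy atom (hydrogens stay implicit by default)
    node_features = [atom.GetAtomicNum() for atom in mol.GetAtoms()]

    edge_index = []
    edge_features = []
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        order = bond.GetBondTypeAsDouble()  # 1.0, 2.0, 3.0, or 1.5 for aromatic
        # Molecular graphs are undirected, so store each bond in both directions
        edge_index += [(i, j), (j, i)]
        edge_features += [order, order]

    return node_features, edge_index, edge_features


# The two molecules from the question
smiles_list = ["CC(=O)Nc1ccc(O)cc1", "CC(=O)Oc1ccccc1C(=O)O"]
graphs = [smiles_to_graph(s) for s in smiles_list]

for s, (nodes, edges, _) in zip(smiles_list, graphs):
    print(s, len(nodes), "atoms,", len(edges) // 2, "bonds")
```

With a dataframe, the same function could be applied to the SMILES column one row at a time. The node and edge lists can then be packed into whatever graph container the chosen model library expects.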
