OK, I have parsed my dataset from its source website, and here are the preliminary results:
I got molfiles and loaded them as Mol objects.
Then I converted those Mol objects to SMILES strings.
Then I used the ConvMol featurizer, and here is the problem:
the get_num_atoms method does not return the number of atoms I expect.
My current thought is that one of three steps needs more careful handling than I have given it: loading the molfiles, converting the Mol objects to SMILES strings, or applying the ConvMol featurizer.
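One way to narrow this down is to count the atoms in the raw molfile first, before any toolkit touches it, and split that count into heavy atoms and hydrogens. This is only a rough sketch and rests on an assumption I have not confirmed: that the mismatch comes from explicit hydrogens. RDKit's molfile readers strip hydrogens by default (removeHs=True), and SMILES normally keeps hydrogens implicit, so a featurizer fed from SMILES would most likely report heavy atoms only. The helper name count_atoms_in_molblock, the methanol example, and its coordinates are all made up for illustration, and the parser only handles V2000 molfiles.

```python
def count_atoms_in_molblock(molblock: str) -> dict:
    """Count total, heavy, and hydrogen atoms in a V2000 molblock."""
    lines = molblock.splitlines()
    # Line 4 is the counts line; its first three columns hold the atom count.
    n_atoms = int(lines[3][0:3])
    # Atom lines follow; the element symbol sits in columns 32-34 (1-based).
    symbols = [lines[4 + i][31:34].strip() for i in range(n_atoms)]
    n_hydrogens = sum(1 for s in symbols if s == "H")
    return {
        "total": n_atoms,
        "heavy": n_atoms - n_hydrogens,
        "hydrogens": n_hydrogens,
    }


def _atom_line(symbol: str, x: float, y: float, z: float) -> str:
    # Fixed-width V2000 atom line; built in code to keep the columns aligned.
    return f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3} 0  0  0  0  0  0  0  0  0  0  0  0"


def _bond_line(a: int, b: int, order: int) -> str:
    return f"{a:3d}{b:3d}{order:3d}  0"


# Hypothetical example: methanol written with explicit hydrogens.
# The coordinates are placeholders and carry no chemical meaning.
atoms = [
    ("C", 0.0, 0.0, 0.0),
    ("O", 1.4, 0.0, 0.0),
    ("H", -0.4, 1.0, 0.0),
    ("H", -0.4, -0.5, 0.9),
    ("H", -0.4, -0.5, -0.9),
    ("H", 1.7, 0.9, 0.0),
]
bonds = [(1, 2, 1), (1, 3, 1), (1, 4, 1), (1, 5, 1), (2, 6, 1)]

molblock = "\n".join(
    ["methanol", "  hypothetical example", ""]
    + [f"{len(atoms):3d}{len(bonds):3d}  0  0  0  0  0  0  0  0999 V2000"]
    + [_atom_line(*atom) for atom in atoms]
    + [_bond_line(*bond) for bond in bonds]
    + ["M  END"]
)

counts = count_atoms_in_molblock(molblock)
print(counts)
```

If get_num_atoms matches the "heavy" value rather than "total", the difference is probably just the hydrogens being dropped somewhere in the pipeline, not a real parsing error.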
Has anyone else encountered such issues before?
Thanks for your attention!