Hello.
I have been trying to encode and featurize my own data using the one-hot encodings that are applied when creating a WeaveMol. However, I run into issues when I try to fit a model on a dataset built from my own WeaveMol objects.
My data consists of molecules represented as graphs with custom values on the edges (similar to distances). I have tried to create something similar to the WeaveFeaturizer, and I am successful most of the way.
The problem occurs when I try to feed this data to an MPNNModel; the response I get is this:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-72-6a0b50c0b8f0> in <module>
----> 1 model.fit(train, nb_epoch=10)
~/.local/lib/python3.8/site-packages/deepchem/models/keras_model.py in fit(self, dataset, nb_epoch, max_checkpoints_to_keep, checkpoint_interval, deterministic, restore, variables, loss, callbacks, all_losses)
318 The average loss over the most recent checkpoint interval
319 """
--> 320 return self.fit_generator(
321 self.default_generator(
322 dataset, epochs=nb_epoch,
~/.local/lib/python3.8/site-packages/deepchem/models/keras_model.py in fit_generator(self, generator, max_checkpoints_to_keep, checkpoint_interval, restore, variables, loss, callbacks, all_losses)
395 # Main training loop.
396
--> 397 for batch in generator:
398 self._create_training_ops(batch)
399 if restore:
~/.local/lib/python3.8/site-packages/deepchem/models/graph_models.py in default_generator(self, dataset, epochs, mode, deterministic, pad_batches)
1159 # pair features
1160 pair_feat.append(
-> 1161 np.reshape(mol.get_pair_features(),
1162 (n_atoms * n_atoms, self.n_pair_feat)))
1163
<__array_function__ internals> in reshape(*args, **kwargs)
~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in reshape(a, newshape, order)
297 [5, 6]])
298 """
--> 299 return _wrapfunc(a, 'reshape', newshape, order=order)
300
301
~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
56
57 try:
---> 58 return bound(*args, **kwds)
59 except TypeError:
60 # A TypeError occurs if the object does have such a method in its
ValueError: cannot reshape array of size 6272 into shape (196,3)
I’m sure this has something to do with the shape of my data being something that the MPNN doesn’t expect.
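As a quick check on the numbers in the error message, here is a minimal NumPy-only sketch. It assumes that 196 is n_atoms * n_atoms and that 3 is the n_pair_feat the model was built with, which is how the reshape call in the traceback reads. The variable names are only for illustration.

```python
import numpy as np

# Numbers taken from the ValueError in the traceback.
total_size = 6272      # size of the array returned by get_pair_features()
n_pairs = 196          # n_atoms * n_atoms in the reshape call
expected_width = 3     # assumed to be the model's n_pair_feat

# Placeholder array with the same number of elements as the real one.
pair_features = np.zeros(total_size)

# Reproduce the failure: 196 * 3 = 588 elements, not 6272.
try:
    pair_features.reshape(n_pairs, expected_width)
    reshape_ok = True
except ValueError:
    reshape_ok = False

# Feature width per pair that the data actually implies.
actual_width, remainder = divmod(total_size, n_pairs)
n_atoms = int(round(np.sqrt(n_pairs)))

print(reshape_ok, n_atoms, actual_width, remainder)
```

If those assumptions hold, the molecule has 14 atoms and each pair carries 32 values, so the pair features and the model's n_pair_feat disagree on the width (32 versus 3).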
The data is basically a list of WeaveMol objects, each with:
- nodes: mostly the same as in the WeaveFeaturizer
- pair_edges: exactly the same as in the WeaveFeaturizer
- pairs: this is where I suspect the problem is, as it contains a vector for every edge in the complete graph of the molecule, with one-hot encodings of my distance measure.
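To make the pairs entry concrete, here is a rough sketch of the kind of encoding I mean. It is not my actual featurizer: the helper name one_hot_pair_features, the bin edges and the 3-atom distance matrix are all made up for illustration, and it only uses NumPy.

```python
import numpy as np

def one_hot_pair_features(dist, bin_edges):
    """One-hot encode an (n_atoms, n_atoms) matrix of edge values.

    Returns an array of shape (n_atoms * n_atoms, n_bins), where
    n_bins = len(bin_edges) + 1. Hypothetical helper, for illustration.
    """
    dist = np.asarray(dist, dtype=float)
    n_bins = len(bin_edges) + 1
    # np.digitize maps each value to a bin index in [0, n_bins - 1].
    bin_idx = np.digitize(dist.reshape(-1), bin_edges)
    # Row i of the identity matrix is the one-hot vector for bin i.
    return np.eye(n_bins)[bin_idx]

# Made-up 3-atom example with symmetric edge values.
dist = np.array([[0.0, 1.2, 2.7],
                 [1.2, 0.0, 1.6],
                 [2.7, 1.6, 0.0]])
bin_edges = [1.0, 2.0]  # three bins: below 1, from 1 up to 2, 2 and above

pairs = one_hot_pair_features(dist, bin_edges)
n_pair_feat = pairs.shape[1]
print(pairs.shape, n_pair_feat)
```

The second dimension of this array is the per-pair feature width. Judging from the reshape call in the traceback, that is the number the model's n_pair_feat has to match.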
Any input would be appreciated.