Using a preprocessed version of QM9 with DeepChem's models

Hello folks! I’m Clara, a new member of this incredible community. I have a background in chemistry and currently work as a software engineer. I’m working on a personal project using DeepChem, where I aim to predict the HOMO, LUMO, and gap properties. For this, I am using the QM9 dataset.

I have preprocessed the QM9 dataset to remove molecules with valency issues and inconsistent InChI values. Now, I want to use DeepChem’s DTNN and MultitaskFitTransformRegressor models with my custom version of the QM9 dataset.

Is there a way to use my preprocessed QM9 dataset instead of the original QM9 dataset from MolNet? I noticed that the examples in the DeepChem repository always load the original QM9 dataset (for example, https://github.com/deepchem/deepchem/blob/master/examples/qm9/qm9_tf_model.py).

Thank you in advance! If anyone is interested, feel free to connect with me :slight_smile: